I. INTRODUCTION


In the defense community, military and civilian analysts
are frequently confronted with problems in which one or more
objectives are to be optimized subject to resource constraints.
By their very nature, many military problems are very complex
and highly unstructured. As a result, the analyst is faced
with many decisions, including: what is a suitable measure of
effectiveness; what are the effective constraints; what
assumptions need to be made; can the original problem be
reduced or transformed to a simpler model which is easier to
solve; and how sensitive are the results to the underlying
assumptions?

It is the concern of this thesis to investigate various
modelling choices and modelling decisions in order to guide
the analyst. Modelling choices include the actual form of the
underlying mathematical model, for example, goal programming,
separable programming, or linear fractional programming.
Modelling decisions describe the operations that are performed
on the mathematical model once a specific formulation is
chosen. Included in this category are transformation of
variables, scale changes, and translation and rotation of
coordinates. Chapter II will introduce terminology and
classification of mathematical programming problems; Chapter
III will illustrate some modelling choices; Chapters IV and V
will describe the various transformations and scaling
operations examined in this thesis.

The scope of the research involved in this study is an
examination of the applicable mathematical programming
literature regarding modelling, transformations, and scaling,
followed by the testing of specific ideas on two commercially
available nonlinear programming codes. The literature search
revealed that little groundwork has been done in this area,
that most of what has been found is dated, and, furthermore,
that no work of this kind has been done on commercial
constrained optimization codes. The two codes in question are
the Generalized Reduced Gradient (GRG) code of Lasdon, Waren,
Ratner, and Jain /1/ /2/, and the Sequential Unconstrained
Minimization Technique (SUMT) of Fiacco and McCormick /3/ /4/
/5/. System documentation for this GRG code was dated
November, 1975 /6/.

For  an  analyst  who  has  not  gained  much  experience  with 
constrained  optimization,  the  chapter  on  specific  modelling 
choices  will  illustrate  the  types  of  conversions  that  can 
be  made  on  nonlinear  problems  to  get  them  into  a  form  where 
a  commercial  linear  programming  package  can  be  used. 

The main thrust of the computational experiments was
to take a few well known test problems for which the optimum
solution was known, transform or scale the problem in some
manner, and determine what effect the change had on the code.
It needs to be emphasized that SUMT and GRG are just two of
several constrained optimization codes that have been
developed in recent years. Without codes of this type, the
analyst can still solve mathematical programs containing
nonlinearities by using a linear program that approximates
the nonlinear terms. The state of the art in unconstrained
nonlinear optimization techniques is highly developed, but
in constrained nonlinear optimization, although the body of
theory is large, the area of technique has evolved slowly.

The transformations of variables discussed in Chapter
IV can be used to transform some nonlinear problems into
unconstrained or partially constrained problems. Eliminating
even a few constraints from a problem can, in principle, make
it easier to solve using the SUMT code. On the other hand,
empirical results presented in this thesis indicate that the
same transformations, when applied to test problems solved by
GRG or SUMT, made the test problems more difficult to solve.
Computation time was considerably increased using
transformations, and in numerous experiments, the GRG code
could not find a feasible point. Since these transformations
actually restrict the variables to be nonnegative or to take
on values in a certain range, a given problem can be modified
significantly by a change of variables. In addition,
transformations may cause some local optima to be lost.

Transformations to obtain separability of variables are
also discussed in Chapter IV, which includes a description of
a diagonalization algorithm to transform quadratic expressions
into sums of squares. The results of experiments that were
conducted  in  diagonalizing  quadratic  forms  of  different 
dimensions  are  included  to  give  the  analyst  an  idea  of  the 
time  trade-off  that  he  must  make  to  get  such  an  expression 
into  separable  form.  There  is  also  a  discussion  of  barrier 
and  penalty  function  transformations,  of  which  the  SUMT  code 
used  in  the  experiments  is  a  prime  example. 

The  final  chapter  examines  the  use  of  scaling,  rotation, 
and  translation  operations  and  discusses  the  sensitivity  of 
the  GRG  code  to  these  techniques.  The  empirical  results 
from the test problems considered indicate that proper
scaling is critical to the successful utilization of GRG. This
code  is  also  highly  affected  by  attempts  at  rotation  and 
translation  of  coordinates.  Multiple  rotation  operations 
on  a  test  problem  having  nonlinear  equality  constraints 
made  that  problem  much  harder  to  solve.  The  translation 
experiments  indicate  that  translations  will  increase  the 
amount  of  time  and  the  number  of  iterations  required  to 
find  a  solution. 

The principles, maxims, and heuristics presented here
will not guarantee success, but if adhered to, should
provide the analyst with guidance when attempting to solve
constrained optimization problems. The codes do not guarantee
the correct optimum solution, and in some cases, will not
construct a solution even though one may exist. Each
computer code has a number of adjustable parameters, and these
adjustments, in turn, affect the efficiency of the
corresponding algorithm. There are recommended average values
for the parameters required by a specific code, but choosing
these values may prevent the code from operating at maximum
efficiency for a given problem. A final caution on the codes
is in order. They have been fine tuned on a number of
well-known, properly scaled test problems of varying degrees of
difficulty. The performance of these codes on a real-world
problem is highly dependent upon correct formulation and
proper scaling by the analyst.

As a prelude to the remainder of this study, consider
the weapon allocation problem developed by Koopman /7/,
which is an example of a nonlinear optimization problem.

The basic model is one that maximizes the expected damage
subject to the total number of weapons available, and in
its simplest mathematical form can be expressed as:

$$
\begin{aligned}
\text{maximize}\quad & \sum_{j=1}^{n} V_j \left[ 1 - \exp\left( -\sum_{i=1}^{m} \beta_{ij} x_{ij} \right) \right] \\
\text{subject to:}\quad & \sum_{j=1}^{n} x_{ij} \le N_i, \qquad i = 1, 2, \ldots, m \\
& x_{ij} \ge 0
\end{aligned}
\tag{1}
$$

where:

$V_j$ = value of the $j$th target

$x_{ij}$ = number of weapons of type $i$ allocated to target $j$

$N_i$ = total number of weapons of type $i$

$\beta_{ij}$ = constant which incorporates the probability of
hitting target $j$ with weapon $i$, and the rate at
which target value decreases with each direct
hit

This allocation problem will be referred to again in
later chapters. An analyst trying to solve this problem can
either apply transformations to get the problem into
separable form, or try to solve it directly with a code like GRG
or SUMT. His approach will be greatly influenced by the
programming codes available to him.

II. THE MATHEMATICAL PROGRAMMING PROBLEM


A.  DEFINITION  OF  THE  PROBLEM 

Let $\mathbf{x} = (x_1, x_2, \ldots, x_n)^T$ represent a vector in a space
of n dimensions. Let f(x) and $g_j(x)$, $j = 1, 2, \ldots, m$, be
functions defined in the vector space. The general mathematical
programming problem is to find an x* such that f(x*) will be
the maximum or minimum of f(x) under the constraining
conditions:

$$
g_j(\mathbf{x}^*) \ge 0, \qquad j = 1, 2, \ldots, m
\tag{2}
$$

B.  TERMINOLOGY 

Before  proceeding,  it  is  necessary  to  specify  some 
terminology  appropriate  to  the  mathematical  programming 
problem  that  is  used  throughout  this  thesis. 

The x vector is an n-dimensional vector of unknown
problem variables. Specification of an x vector determines
a point in n-dimensional space and also determines a value
for f(x) and $g_j(x)$.

The $g_j(x)$ in inequality (2) are called constraints,
and form a closed region in n-dimensional space, thus
limiting values of x to points in the closed region.

The closed region defined by the constraints is called
the feasible region for a given problem. A feasible point
is any point x that lies in the feasible region. Points
outside this region are called non-feasible.

The function f(x) is called the objective function of
the problem. Mathematical programming attempts to optimize
f(x) over the feasible region defined by inequality (2).

Optimal  solutions  are  not  necessarily  unique.  More 
than  one  solution  may  have  the  same  minimum  or  maximum  value. 
Necessary  conditions  for  optimality  are  conditions  that  an 
optimal  solution  must  satisfy,  but  that  other  nonoptimal 
solutions  may  also  satisfy.  A  sufficient  condition  for 
optimality  is  one  that,  if  satisfied  by  a  given  solution, 
guarantees  that  the  given  solution  is  optimum.  For  many 
general  problems,  conditions  that  are  both  necessary  and 
sufficient  cannot  be  determined;  the  best  that  can  be  done 
is  to  show  that  a  given  solution  is  a  local  optimum.  A 
local  minimum  (maximum)  is  any  feasible  point  such  that  any 
small  perturbation  around  it,  still  remaining  in  the  feasible 
region,  will  increase  (decrease)  the  value  of  the  objective 
function.  The  global  optimum  is  that  local  optimum  for 
which  the  objective  function  has  its  smallest  (largest) 
value. 
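
The distinction between local and global optima can be illustrated with a small numerical sketch. The one-variable objective, the interval, and the grid size below are assumptions chosen only for illustration; the grid search is not a method used by GRG or SUMT.

```python
def f(x):
    """Illustrative objective with two local minima (an assumed example)."""
    return x**4 - 3 * x**2 + x


def local_minima_on_grid(func, lo, hi, steps):
    """Return (x, f(x)) for interior grid points lower than both neighbors."""
    xs = [lo + (hi - lo) * k / steps for k in range(steps + 1)]
    fs = [func(x) for x in xs]
    return [
        (xs[k], fs[k])
        for k in range(1, steps)
        if fs[k] < fs[k - 1] and fs[k] < fs[k + 1]
    ]


# Feasible region assumed to be the interval -2 <= x <= 2.
minima = local_minima_on_grid(f, -2.0, 2.0, 4000)
global_min = min(minima, key=lambda point: point[1])
print("local minima:", minima)
print("global minimum:", global_min)
```

Both grid points returned are local minima in the sense defined above, but only the one with the smaller objective value is the global minimum over the interval.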

C. CLASSIFICATION OF MATHEMATICAL PROGRAMMING PROBLEMS

Mathematical programming problems are generally classified
on the basis of the mathematical form of f(x) and $g_j(x)$.

Linear programming results when both f(x) and the $g_j(x)$
are linear functions of x. A linear function has constant
values for its partial derivatives with respect to x, and
thus has a constant gradient. Linear programming is the
most well known mathematical programming problem, and
computer routines using the simplex algorithm are widely
available and can handle thousands of variables and constraints.

Goal programming is a simple extension of linear
programming which attempts to handle multiple, and frequently
conflicting, goals. The goal programming formulation is
discussed in the next chapter. The goal programming approach
is  to  combine  the  multiple  goals,  weighted  by  appropriate 
factors,  into  a  single  objective  function  that  is  to  be 
optimized.  Lee  /8/  /9/  has  made  numerous  applications  of 
goal  programming  dealing  specifically  with  the  problem  of 
handling  a  hierarchy  of  goals,  in  which  the  most  important 
goal  has  a  much  higher  priority  than  lower  level  goals. 

When f(x) and/or the $g_j(x)$ are not linear functions of
x, the problem is called a nonlinear programming problem.
Definitive general statements of necessary and sufficient
conditions are available only for limited cases. For the
special case of a convex nonlinear programming problem in
which the objective function and constraints are convex,
the necessary conditions that an optimal solution must
satisfy are commonly referred to as the Kuhn-Tucker
conditions. References /10/ /11/ /13/ provide good
descriptions of the Kuhn-Tucker conditions.

When f(x) is a quadratic function of x and the $g_j(x)$
are linear, the problem is called a quadratic program. In
principle, this problem is almost as easy to solve as the
linear programming problem, and differs from it mainly in
that the gradient of f(x) is a linear function of x.
In practice, the quadratic programming problem is solved
either by conversion to an approximate linear programming
problem, or by solving it directly using a nonlinear
programming algorithm.

If the constraints are linear and the nonlinear
objective function can be expressed as the ratio of two linear
functions, a special mathematical programming problem known
as linear fractional programming results. It can be solved
by computing the optimum solution to at most two linear
programs.

Many nonlinear problems can be solved by the use of
some linearizing technique followed by the application of
the simplex algorithm. One such technique is called separable
programming, which requires that a nonlinear function be
separated into a sum of several terms, each of which is a
function of a single variable. These terms are linearized
by calculating their values over a grid of points in the
convex region. The simplex algorithm is then applied to the
linearized problem.

Separable  programming,  goal  programming,  and  linear 
fractional  programming  can  be  converted  into  problems  solvable 
by  the  simplex  algorithm,  and  will  be  discussed,  in  turn,  in 
the  next  chapter. 

III. MODELLING CHOICES


The previous chapter was intended to present an overview
of the modelling choices available to the analyst for
a given mathematical programming problem. As the
analyst formulates his model he must bear in mind the
tradeoff between simplicity and accuracy. If he oversimplifies
his model by assuming away nonlinearities, he may wind up
with a linear programming model that gives poor results and
is not representative of the actual problem. If he attempts
to include every detail of the problem, the formulation may
become so complex that the model becomes incomprehensible.

The weapon allocation model cited in Chapter I is an
example of a simple model formulation that can easily
become a formidable problem by making a few modifications.

A descriptive problem studied by Bracken and McGill
illustrates how difficult this problem can become. The
application involves the targeting of sea-launched
ballistic missiles on strategic bomber bases. The objective is
to allocate submarines to possible launch areas and to
find a targeting pattern against the bomber bases so as to
maximize the number of bombers destroyed. There are
technological constraints which prevent launching all of the
missiles simultaneously. Furthermore, the flight time of
a missile depends upon the distance between the launch point
and the target. Consequently, the enemy can scramble some of
his bombers, with the number scrambled increasing with time.
The objective function will have to be modified further if
the missiles destroy only part of the bomber bases. These
features of time-phased allocations and time-deteriorating
values significantly alter the model formulation. The same
authors  offer  other  interesting  defense  applications  in 
reference  /,1^. 

There  is  a  direct  interaction  between  the  mathematical 
programming  problem  and  the  techniques  available  to  solve  it. 
The  state  of  the  art  in  nonlinear  programming  is  such  that 
very  large  problems  can  be  handled  only  in  special  cases  for 
problems  having  special  structure.  The  generally  available 
codes are currently limited to about 100 variables because
of excessive computational time and storage
requirements. There is also a tradeoff between obtaining an
exact  solution  to  an  approximate  problem,  or  an  approximate 
solution  to  an  exact  problem. 

In this chapter three special purpose nonlinear
programming problems are discussed that can be solved by linear
programming methodology. Separable programming is the only
one of the three that is widely known. Since it uses large
scale linear programming codes, it can be used to solve
nonlinear problems having thousands of variables and
constraints. Goal programming and linear fractional
programming are two other models that can also be solved by
commercial linear programming codes. However, the selection
of one of these models is not necessarily the best approach
to use for a given problem if the problem is of moderate
size. These models do have the following advantages:

1.  They  are  easy  to  prepare. 

2.  They  can  be  of  very  large  size. 

3.  They  can  be  used  routinely  in  that  they  are 
solvable  by  generally  available,  well-documented 
linear  programming  codes. 

4. It is easy to perform a sensitivity analysis on
the variables and/or parameters of the problem
to determine the effect of changes in these
quantities.

The  discussion  of  these  models  in  this  chapter  should 
provide  a  framework  for  the  analyst  that  will  assist  him  in 
choosing  a  specific  model. 

A.  SEPARABLE  PROGRAMMING 

Separable programming is used to obtain an approximate
solution to nonlinear problems having a separable objective
function and constraints. A separable function is one that
can be written in the form $f(x) = \sum_{i=1}^{n} f_i(x_i)$, where $f_i(x_i)$ is
a function of the single variable $x_i$. The mathematical
formulation is:

$$
\begin{aligned}
\text{maximize}\quad & \sum_{i=1}^{n} f_i(x_i) \\
\text{subject to:}\quad & \sum_{i=1}^{n} g_{ij}(x_i) \ge 0, \qquad j = 1, 2, \ldots, m \\
& x_i \ge 0, \qquad i = 1, 2, \ldots, n
\end{aligned}
\tag{3}
$$


The separable problem is reduced to a linear problem
by approximating each separable function by a piecewise linear
function. There are several excellent references that
provide a thorough description of the procedure for converting
the separable problem to an approximate linear problem;
Dantzig, Hadley, Miller /16/, and Beale /17/ /18/ are
included on this list.
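
The piecewise linear approximation can be sketched in a few lines. The example below is a minimal illustration, assuming the common formulation in which a point is written as a weighted combination of two adjacent grid points; the function names, the grid, and the choice of exp(-z) as the nonlinear term are assumptions made for this sketch and are not taken from the cited references.

```python
import math


def lambda_weights(z, breakpoints):
    """Weights on the grid points; at most two adjacent ones are nonzero."""
    weights = [0.0] * len(breakpoints)
    for k in range(len(breakpoints) - 1):
        left, right = breakpoints[k], breakpoints[k + 1]
        if left <= z <= right:
            t = (z - left) / (right - left)
            weights[k], weights[k + 1] = 1.0 - t, t
            return weights
    raise ValueError("z lies outside the grid")


def piecewise_value(z, breakpoints, func):
    """Piecewise linear approximation: sum of weight * func(grid point)."""
    weights = lambda_weights(z, breakpoints)
    return sum(w * func(b) for w, b in zip(weights, breakpoints))


def g(z):
    """Nonlinear single-variable term to be approximated (assumed example)."""
    return math.exp(-z)


grid = [0.0, 0.5, 1.0, 2.0, 4.0]   # assumed grid of points
approx = piecewise_value(1.5, grid, g)
print("approximation:", approx, "exact:", g(1.5))
```

In a linear program the weights become the decision variables, and the nonlinear term is replaced by a linear expression in those weights. A finer grid reduces the approximation error at the cost of more variables.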

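The weapon system allocation problem (1) can be formulated
as a separable programming problem by introducing the
new variable $z_j = \sum_{i=1}^{m} \beta_{ij} x_{ij}$. The problem therefore
becomes:

$$
\begin{aligned}
\text{maximize}\quad & \sum_{j=1}^{n} V_j \left( 1 - e^{-z_j} \right) \\
\text{subject to:}\quad & z_j - \sum_{i=1}^{m} \beta_{ij} x_{ij} = 0, \qquad j = 1, 2, \ldots, n \\
& \sum_{j=1}^{n} x_{ij} \le N_i, \qquad i = 1, 2, \ldots, m \\
& x_{ij} \ge 0
\end{aligned}
\tag{4}
$$

or equivalently

$$
\begin{aligned}
\text{minimize}\quad & \sum_{j=1}^{n} V_j e^{-z_j} \\
\text{subject to:}\quad & z_j - \sum_{i=1}^{m} \beta_{ij} x_{ij} = 0, \qquad j = 1, 2, \ldots, n \\
& \sum_{j=1}^{n} x_{ij} \le N_i, \qquad i = 1, 2, \ldots, m \\
& x_{ij} \ge 0
\end{aligned}
\tag{5}
$$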

Problem (5) can now be approximated by piecewise linear
functions. A section in the following chapter on
transformations describes various methods of converting nonlinear
terms of several variables into separate terms of single
variables.
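
The equivalence between maximizing expected damage and minimizing surviving target value can be checked numerically. The sketch below assumes the exponential damage form described for the allocation model; the target values, rate constants, and allocation are hypothetical numbers invented for illustration.

```python
import math


def effort(rates, alloc, j):
    """z_j: weighted sum over weapon types i of rate_ij * x_ij for target j."""
    return sum(rates[i][j] * alloc[i][j] for i in range(len(alloc)))


def expected_damage(values, rates, alloc):
    """Damage form: sum_j V_j * (1 - exp(-z_j))."""
    return sum(
        v * (1.0 - math.exp(-effort(rates, alloc, j)))
        for j, v in enumerate(values)
    )


def surviving_value(values, rates, alloc):
    """Separable form: sum_j V_j * exp(-z_j)."""
    return sum(
        v * math.exp(-effort(rates, alloc, j)) for j, v in enumerate(values)
    )


# Hypothetical data: 2 weapon types (rows) and 3 targets (columns).
values = [10.0, 6.0, 4.0]
rates = [[0.5, 0.2, 0.1], [0.3, 0.4, 0.6]]
alloc = [[2.0, 1.0, 0.0], [0.0, 1.0, 3.0]]

damage = expected_damage(values, rates, alloc)
survive = surviving_value(values, rates, alloc)
print(damage, survive, damage + survive)
```

For any allocation the two quantities sum to the total target value, a constant, so maximizing the first is the same as minimizing the second. Each term of the second form depends on a single variable, which is what the piecewise linear approximation requires.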

Some  motivations  for  dealing  with  separable  programming 
are  as  follows: 

(1)  in  the  case  of  convex  problems,  it  allows  the 
use  of  large  scale  linear  programming  codes; 

(2)  in  the  case  of  nonconvex  problems,  it  allows 
the  use  of  linear  programming  codes  with  a  separable  option, 
for  which  there  are  special  basis  entry  rules  (see  Beale 
1^2/  for  a  discussion  of  these  rules) ; 

(3) it is easy to adapt the separable formulation to
a constrained nonlinear programming code such as SUMT (i.e.,
it is much simpler to compute the analytical derivatives
required by SUMT when the variables are separated);

(4)  by  removing  the  interaction  between  variables, 
it  is  easy  to  see  the  effects  of  transformations  on  the 
variables. 


B.  GOAL  PROGRAMMING 


Goal programming is a useful concept when multiple
goals are either in direct conflict, or can be achieved
only at the expense of other goals. A simple example would
be a model that involves two types of manpower in two
different time periods. Assume that an analyst is faced with the
task of recommending the number of officers and enlisted
personnel to recruit for a special program in the next two
fiscal years. Assume that the only costs involved are the
salaries of the recruits, which are $15,000 for officers and
$10,000 for enlisted men. The budget for the program is
$4,000,000 for the first fiscal year and $6,000,000 for the
second. The desired goals for the number of officers are 50
for the first year and 75 for the second year. The
corresponding minimums are 10 and 60 for the two years. The
desired goals for enlisted men in the program are 250 and 400,
with corresponding minimums of 100 and 175.

A goal programming formulation, with costs and budgets
expressed in thousands of dollars, is:

$$
\begin{aligned}
\text{minimize}\quad & |x_1 - 50| + |x_2 - 250| + |x_3 - 75| + |x_4 - 400| \\
\text{subject to:}\quad & x_1 \ge 10 \\
& x_2 \ge 100 \\
& x_3 \ge 60 \\
& x_4 \ge 175 \\
& 15 x_1 + 10 x_2 \le 4000 \\
& 15 x_3 + 10 x_4 \le 6000
\end{aligned}
\tag{6}
$$

where

$x_1$ = number of officers in the first year

$x_2$ = number of enlisted men in the first year

$x_3$ = number of officers in the second year

$x_4$ = number of enlisted men in the second year


Figure 1 illustrates several types of objective functions
which can be classified as goal programming problems. Figure
1-A is an absolute value function having asymmetric weights.
Figure 1-B is a quadratic function $(x_j - G_j)^2$, and Figure 1-C
is a piecewise linear function.

FIGURE 1: Typical Goal Programming Models

The goal objective function consists of a sum of
functions of a single variable, some of which may be nonlinear.
Therefore, the goal programming problem is reducible to
either a linear program or separable program depending on
how the model is formulated.

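A more general formulation of the goal programming
problem is:

$$
\begin{aligned}
\text{minimize}\quad & f(\mathbf{x}) = \sum_{j=1}^{n} f_j(x_j) \\
\text{subject to:}\quad & A\mathbf{x} \le \mathbf{b} \\
& \mathbf{x} \ge 0
\end{aligned}
\tag{7}
$$

If the absolute value function is used, the mathematical
formulation is:

$$
\begin{aligned}
\text{minimize}\quad & \sum_{j=1}^{n} w_j \left| \mathbf{c}_j^T \mathbf{x} - G_j \right| \\
\text{subject to:}\quad & A\mathbf{x} \le \mathbf{b} \\
& \mathbf{x} \ge 0
\end{aligned}
\tag{8}
$$

If a quadratic function is used, the formulation is:

$$
\begin{aligned}
\text{minimize}\quad & \sum_{j=1}^{n} w_j \left( \mathbf{c}_j^T \mathbf{x} - G_j \right)^2 \\
\text{subject to:}\quad & A\mathbf{x} \le \mathbf{b} \\
& \mathbf{x} \ge 0
\end{aligned}
\tag{9}
$$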

A more extensive mathematical treatment of goal
programming can be found in Refs. /19/ and /20/.

To handle the nonlinear problem described in (8), slack
variables $y_j$ and $z_j$ are added to the problem to represent
positive and negative deviation from each goal $G_j$. For each
goal, one or both of these slack variables will equal zero.
Minimizing the sum of absolute deviations is equivalent to
the following linear program:

$$
\begin{aligned}
\text{minimize}\quad & \sum_{j=1}^{n} w_j \left( y_j + z_j \right) \\
\text{subject to:}\quad & \mathbf{c}_j^T \mathbf{x} - y_j + z_j = G_j, \qquad j = 1, 2, \ldots, n \\
& A\mathbf{x} \le \mathbf{b} \\
& \mathbf{x}, \mathbf{y}, \mathbf{z} \ge 0
\end{aligned}
\tag{10}
$$

The  nonlinear  problem  described  in  (9)  can  be  solved  by 
making  piecewise  linear  approximations  to  each  such  term  in 
the  objective  function,  and  using  the  technique  of  separable 
programming. 
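
The accuracy of a piecewise linear approximation to a quadratic goal term depends on the number of segments. The sketch below is an illustration under assumed values: the goal level of 50, the range 0 to 100, and the equally spaced grid are hypothetical. It measures the largest gap between the quadratic and its chords, which for a quadratic occurs at the midpoint of each segment.

```python
def max_chord_error(goal, lo, hi, segments):
    """Largest gap between (t - goal)**2 and its chords on an even grid."""
    width = (hi - lo) / segments
    worst = 0.0
    for k in range(segments):
        left = lo + k * width
        right = left + width
        mid = 0.5 * (left + right)
        # Value of the chord at the midpoint of the segment.
        chord_mid = 0.5 * ((left - goal) ** 2 + (right - goal) ** 2)
        worst = max(worst, chord_mid - (mid - goal) ** 2)
    return worst


for segments in (2, 6, 100):
    print(segments, max_chord_error(50.0, 0.0, 100.0, segments))
```

The error equals one quarter of the squared segment width, so doubling the number of segments cuts the error by a factor of four while adding variables to the linear program.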

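The manpower planning model described previously can be
reformulated as the following linear program:

$$
\begin{aligned}
\text{minimize}\quad & y_1 + z_1 + y_2 + z_2 + y_3 + z_3 + y_4 + z_4 \\
\text{subject to:}\quad & x_1 - y_1 + z_1 = 50, \qquad x_1 \ge 10 \\
& x_2 - y_2 + z_2 = 250, \qquad x_2 \ge 100 \\
& x_3 - y_3 + z_3 = 75, \qquad x_3 \ge 60 \\
& x_4 - y_4 + z_4 = 400, \qquad x_4 \ge 175 \\
& 15 x_1 + 10 x_2 \le 4000 \\
& 15 x_3 + 10 x_4 \le 6000 \\
& x_i, y_i, z_i \ge 0, \qquad i = 1, \ldots, 4
\end{aligned}
\tag{11}
$$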
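
The role of the deviation variables in the manpower model can be checked with a short sketch. The goals, minimums, salaries, and budgets are the figures given in the text, in thousands of dollars; the trial plan and all function names are hypothetical and introduced only for illustration. No linear programming solver is used here.

```python
GOALS = [50, 250, 75, 400]      # desired officers/enlisted, years 1 and 2
MINIMUMS = [10, 100, 60, 175]   # required minimum for each category
COSTS = [15, 10, 15, 10]        # salaries in thousands of dollars
BUDGETS = [4000, 6000]          # yearly budgets in thousands of dollars


def deviations(plan):
    """Split x_i - G_i into overshoot y_i and shortfall z_i (one is zero)."""
    over = [max(x - g, 0) for x, g in zip(plan, GOALS)]
    under = [max(g - x, 0) for x, g in zip(plan, GOALS)]
    return over, under


def is_feasible(plan):
    """Check the minimum-strength and budget constraints of the model."""
    meets_minimums = all(x >= m for x, m in zip(plan, MINIMUMS))
    year1 = COSTS[0] * plan[0] + COSTS[1] * plan[1]
    year2 = COSTS[2] * plan[2] + COSTS[3] * plan[3]
    return meets_minimums and year1 <= BUDGETS[0] and year2 <= BUDGETS[1]


def objective(plan):
    """Linear objective: the sum of all deviation variables."""
    over, under = deviations(plan)
    return sum(over) + sum(under)


plan = [40, 300, 75, 400]   # hypothetical trial plan
print(is_feasible(plan), objective(plan))
```

For any plan, the sum of the deviation variables reproduces the sum of absolute deviations in the original formulation. With the figures given, the goal levels themselves satisfy every constraint, so this small example attains all four goals at once.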

A real world example of a goal programming model is one
that was done by Major Calvin Anderson of the U.S. Army
Concepts Analysis Agency, and by Professors G.H. Bradley and
G.G. Brown of the Naval Postgraduate School. The model
determined the optimum distribution of officers, by rank, in
various specialties, with the ideal utilization equal to 0.5
in primary and secondary specialties. The total number of
variables was 437, and the objective was to minimize the
sum of deviations between the actual and desired utilization
in the specified billets. The problem was solved using both
absolute value and piecewise linear (6 to 100 segments per
function approximated) approximations to a quadratic goal
function on a FORTRAN network code called GNET /21/. This
asset utilization model illustrates both goal and separable
programming concepts. In increasing the number of segments
used to approximate each function from 2 to between 6 and 100,
the number of arcs in the model increased from 874 to 4334, and
the solution time increased from 2.75 to 36.87 seconds.

C.  LINEAR  FRACTIONAL  PROGRAMMING 

Fractional  programs  result  when  rates  such  as  target 
value  destroyed  to  weapons  expended,  or  retention  rate  of 
aviators  over  a  time  planning  horizon  are  to  be  optimized. 
The  general  mathematical  formulation  is: 

$$
\begin{aligned}
\text{maximize}\quad & \frac{f(\mathbf{x})}{q(\mathbf{x})} \\
\text{subject to:}\quad & g_j(\mathbf{x}) \ge 0, \qquad j = 1, 2, \ldots, m
\end{aligned}
\tag{12}
$$

An excellent theoretical development of fractional
programming can be found in Ref. /22/.

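If f and q are linear and the constraints are linear,
then (12) is called a linear fractional program, and can be
written as:

$$
\begin{aligned}
\text{maximize}\quad & \frac{\mathbf{c}^T \mathbf{x} + \alpha}{\mathbf{d}^T \mathbf{x} + \beta} \\
\text{subject to:}\quad & A\mathbf{x} \le \mathbf{b} \\
& \mathbf{x} \ge 0
\end{aligned}
\tag{13}
$$

where $\alpha$ and $\beta$ are real numbers.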

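The linear fractional program (13) can be reduced to
a linear program by the following variable transformation
proposed by Charnes and Cooper /23/:

$$
\mathbf{y} = \frac{\mathbf{x}}{\mathbf{d}^T \mathbf{x} + \beta}, \qquad
z = \frac{1}{\mathbf{d}^T \mathbf{x} + \beta}, \qquad
\text{if } \mathbf{d}^T \mathbf{x} + \beta > 0
\tag{14}
$$

The resulting linear program is:

$$
\begin{aligned}
\text{maximize}\quad & \mathbf{c}^T \mathbf{y} + \alpha z \\
\text{subject to:}\quad & A\mathbf{y} - \mathbf{b} z \le 0 \\
& \mathbf{d}^T \mathbf{y} + \beta z = 1 \\
& \mathbf{y} \ge 0, \quad z \ge 0
\end{aligned}
\tag{15}
$$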
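
If $\mathbf{d}^T \mathbf{x} + \beta < 0$, this transformation will not work but
can be modified easily as follows:

$$
\mathbf{y} = -\frac{\mathbf{x}}{\mathbf{d}^T \mathbf{x} + \beta}, \qquad
z = -\frac{1}{\mathbf{d}^T \mathbf{x} + \beta}, \qquad
\text{if } \mathbf{d}^T \mathbf{x} + \beta < 0
\tag{16}
$$

The new linear program to solve becomes:

$$
\begin{aligned}
\text{maximize}\quad & -\mathbf{c}^T \mathbf{y} - \alpha z \\
\text{subject to:}\quad & A\mathbf{y} - \mathbf{b} z \le 0 \\
& \mathbf{d}^T \mathbf{y} + \beta z = -1 \\
& \mathbf{y} \ge 0, \quad z \ge 0
\end{aligned}
$$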

The case where the denominator is allowed to be zero
in the feasible region is considered in Ref. /24/.
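
The change of variables for the linear fractional program can be verified numerically. The sketch below is a minimal illustration: the coefficient data are hypothetical, the function names are invented for this example, and the negative-denominator branch assumes that the modified transformation simply reverses the sign of both new variables.

```python
def dot(u, v):
    """Inner product of two equal-length lists."""
    return sum(a * b for a, b in zip(u, v))


def charnes_cooper(x, d, beta):
    """Map x to (y, z) with z = 1/|d.x + beta| and y = z * x."""
    denom = dot(d, x) + beta
    if denom == 0:
        raise ValueError("denominator is zero; the transformation is undefined")
    z = 1.0 / abs(denom)
    y = [z * xi for xi in x]
    return y, z


def ratio(x, c, alpha, d, beta):
    """Original fractional objective (c.x + alpha) / (d.x + beta)."""
    return (dot(c, x) + alpha) / (dot(d, x) + beta)


# Hypothetical data with a positive denominator.
c, alpha, d, beta = [2.0, 1.0], 3.0, [1.0, 4.0], 2.0
x = [1.0, 0.5]
y, z = charnes_cooper(x, d, beta)
linear_objective = dot(c, y) + alpha * z
print(ratio(x, c, alpha, d, beta), linear_objective, dot(d, y) + beta * z)
```

When the denominator is positive, the linear objective in the new variables equals the original ratio and the normalizing constraint evaluates to 1. When it is negative, the same mapping gives a normalizing value of -1 and the ratio equals the negative of the linear expression. The original point is recovered as x = y / z.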

IV. PROBLEM AND VARIABLE TRANSFORMATIONS


Once the analyst decides on a mathematical model,
which may be dictated by the codes available to him, he is
then faced with the decision of whether to apply
transformations. There are several reasons for considering the use
of transformations. First, a computer program for the
solution of the transformed problem may be readily available.
Second, the transformed problem may require less computational
time. Third, the problem as initially formulated may be too
difficult to solve and thus require an approximation.
Finally, the reformulated version may also provide the
analyst with more insight and information than was available
from the original problem.

The classes of transformations that are of interest
to the analyst are the following:

(1) transformations to an unconstrained or partially
constrained problem by a change of variables;

(2) transformations to a separable programming format;

(3) transformations to an unconstrained problem by
barrier or penalty functions.

These  transformations  will  be  discussed  in  order  in  the 
following  sections  of  this  chapter  along  with  a  description 
of  the  experiments  used  to  test  them. 

A. TRANSFORMATIONS TO UNCONSTRAINED OR PARTIALLY
CONSTRAINED PROBLEMS BY A CHANGE OF VARIABLES

Variable transformations to eliminate constraints have
received little attention in the literature, and this is one
area of mathematical programming that is much in need of
updating. Very little empirical work has been done in this
area. In this section, transformations recommended for
conversion of constrained optimization problems to unconstrained
or partially constrained problems are discussed. No previous
work has been done with these transformations in conjunction
with a generally available nonlinear programming code.
These transformations are considered here because an
appropriate choice can possibly eliminate some constraints and
make the problem easier to solve. In the next section, the
results of testing these transformations on the GRG and SUMT
codes are discussed.

Constrained optimization problems can sometimes be
reduced to a simpler form in which no constraints appear
explicitly. These problems can then be solved by a wide
variety of unconstrained optimization techniques which
handle general nonlinear functions.

Box /25/ was one of the first to investigate the
possibility of using transformations to eliminate linear
constraints. The following table lists linear constraints
and some of the change of variables transformations that can
be made.
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TABLE  1:  Change  of  Variables  Transformations  for 

Linear  Constraints 


Constraints 


Trans  formations 


X .  >  0 

1  ~ 


0  ^ 


2  Y- 

X .  =  y .  ;  X .  =e  i  ,*  x .  =  y . 

1  1  1  I  1  1 

2  y  ’ 

X ,  =  sin  y . ;  x .  =  e  ^ 

1  11  - 


/ .  <  X .  <  u . 
—1—1 


sin^  y^ 


-1  <  X.  <  1 
1 


X.  =  sin  y. 
1  1 


0  <  X .  <  X .  < 

—  1  ~  ^  ~ 


2  2  2 
X.  =  y.  ,  X.  =  y.  +  y.  / 

1  1  1  1  1 

2  2  9 


If each variable in a problem has constant upper and
lower bounds, $l_i \le x_i \le u_i$, then the feasible region
consists of a rectilinear n-dimensional box. Replacing each
$x_i$ by $l_i + (u_i - l_i)\sin^2 y_i$ means that an unconstrained
optimum in y-space is being sought. The periodicity of optimal
solutions in transformed space will not cause any difficulty
if small step-size adjustments are made by the optimization
technique.

These transformations map points in the neighborhood of
$y_0$ in y-space into the neighborhood of $x_0$ in x-space.
Although they are not necessarily a 1:1 mapping, these
transformations cannot introduce any additional local optima.

Experience in the application of these techniques is
limited, but the possibility of their use should be kept in
mind by the analyst whenever he formulates a problem. If
even a few constraints can be eliminated, it should help a
code such as SUMT in which the constraints are included in
the modified objective function.
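As a minimal sketch of how such a change of variables removes bounds, the following Python fragment applies the transformation $x = l + (u - l)\sin^2 y$ from Table 1 to a hypothetical one-variable problem (minimize $(x - 3)^2$ subject to $0 \le x \le 2$). The test problem, the function names, and the crude step-halving search are assumptions made for illustration only; they are not the GRG or SUMT codes used in this study.

```python
import math

def to_x(y, lower, upper):
    """Map an unconstrained y to x in [lower, upper] via x = l + (u - l) sin^2 y."""
    return lower + (upper - lower) * math.sin(y) ** 2

def minimize_1d(func, y0, step=0.1, tol=1e-10, max_iter=10000):
    """Crude derivative-free descent: try a step each way, halve the step on failure."""
    y, fy = y0, func(y0)
    for _ in range(max_iter):
        if step < tol:
            break
        for cand in (y + step, y - step):
            fc = func(cand)
            if fc < fy:
                y, fy = cand, fc
                break
        else:
            # Neither direction improved the objective, so refine the step.
            step *= 0.5
    return y

# Hypothetical test problem: minimize (x - 3)^2 subject to 0 <= x <= 2.
lower, upper = 0.0, 2.0
objective = lambda x: (x - 3.0) ** 2

# The search runs over y with no bounds at all; x stays inside the box by construction.
y_star = minimize_1d(lambda y: objective(to_x(y, lower, upper)), y0=0.3)
x_star = to_x(y_star, lower, upper)
print(x_star)
```

The search variable y is never bounded, yet every trial point maps back into the box, and the run ends at the upper bound, which is the constrained optimum of this small problem.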

B.  COMPUTATIONAL  EXPERIENCE  WITH  VARIABLE  TRANSFORMATIONS 
There were three test problems used in the experiments
on the GRG and SUMT codes. The experiments were conducted
on the NPS IBM 360/67 computer system with the Fortran H
Compiler. The codes were loaded on the same data cell and
both used double precision arithmetic. Problems were taken
from the appendix of Himmelblau's text on applied nonlinear
programming, and are given in Appendix A of this thesis
along with the original source. They will be referred to
as Himmelblau problems 16, 4, and 20 respectively. Before
describing the experiments performed, a few words are in
order regarding the test problems. Problem 16 includes a
quadratic objective function of nine variables, 13 quadratic
constraints, and one lower bound. Problem 20 includes a
linear objective function of 24 variables, which are subject
to 12 nonlinear equality constraints, two linear
equality constraints, and six nonlinear inequality constraints.
The 14 equality constraints make it a very
difficult problem. Problem 4 has a nonlinear objective
function of 10 variables which is a logarithmic function.
It is subject to three linear equality constraints and all
variables have lower bounds. The starting point for each
test problem was infeasible. The experiments were conducted
so that the same starting point was always used for a given
test problem. Table 2 summarizes pertinent information on
the three test problems.

The GRG code used in these experiments was designed to
handle small or moderate size problems containing up to 100
variables and 100 constraints, of which 60, at most, can be
binding at any one time. This limitation is based upon
storage requirements. Lasdon is currently working on a GRG
code that will handle much larger problems. The SUMT code
is limited to problems having fewer than 100 variables and
fewer than 200 constraints. The GRG method is described in
Appendix C, and the reader is strongly urged to review this
appendix before reading the following discussion of empirical
results. The SUMT method is discussed in section D of this
chapter in conjunction with penalty and barrier functions.

TABLE 2: Test Problems Used with GRG and SUMT Codes

Problem Number                       16       4       20

Number of variables                   9      10       24

Number of equality constraints:
   linear                             0       3        2
   nonlinear                          0       0       12

Number of inequality constraints:
   linear                             0       0        0
   nonlinear                         13       0        6

Number of lower bounds                1      10       24
Number of upper bounds                0       0        0

CPU time, seconds:
   GRG                             2.90    1.68    13.56
   SUMT                           11.69       -   240.0*

*Optimum still not reached when program was terminated


The GRG method is one that follows an inequality constraint
very closely and therefore is very likely to terminate
at a local rather than global optimum. The experiments
performed using GRG and SUMT are listed in Appendix B.
Experiments 2, 25, and 16 are the reference runs for
problems 16, 4, and 20 respectively using the GRG code.
Experiments 28 and 31 are the reference runs for problems 16
and 20 respectively using SUMT. No reference run was made
for problem 4 using SUMT. The SUMT code was not used
extensively in these experiments because the user must supply
the gradient and Hessian matrix in order for the code to run
efficiently. It was felt that this was too time consuming
(high preparation time) and error prone (excessive debugging
time) to be worthwhile. GRG, on the other hand, is
very easily modified to account for a change of variables.
This is so because GRG uses finite differencing to evaluate
gradients, although the user can provide a subroutine with
the exact analytical derivatives.

In general, problems with nonlinear equality constraints
and either nonlinear or linear inequality constraints are
the hardest to solve, followed by linear equality constraints
with nonlinear inequality constraints, and lastly, nonlinear
or linear inequality constraints. Therefore, of the three
test problems, the order of decreasing difficulty is 20, 4,
16.

Before discussing the transformations on problem 16, it
must be noted that a fixed parameter in the GRG code had to
be modified before the correct optimum could be obtained.
A parameter called TOL in the DEGEN subroutine had to be
tightened in order to obtain the correct solution for problem
16. The subroutine DEGEN is called when the basis constructed
is degenerate. The change was suggested by Lasdon for this
particular test problem, and will not work for other problems.
Experiments 1 and 2 show the effect of this modification on
the basic problem. Experiments 3 through 15 show the
effects of various transformations on problem 16. In
experiment 3, the problem was modified by placing lower bounds on
all variables, but it yielded only a local optimum.

In experiment 4, the transformation $x_i = y_i^2$ was
attempted, which again stopped at a local optimum. The
problems run in experiments 3 and 4 were then run on the
SUMT code (experiments 29 and 30), both yielding the correct
maximum value but at different values of x. Experiments 5
and 6 were then repeats of experiments 3 and 4, except for
the starting points, which were taken from the optimum x in
the SUMT experiments. Aside from these two experiments all
others were commenced from the same starting point by
modifying the input vector as appropriate.

In experiments 7, 8, and 14, the transformation
$x_i = |y_i|$ was attempted. The first two of these were with and
without lower bounds on $y_i$ respectively, both yielding
local optima which differed only in that two of the $x_i$ were
reversed. In experiment 14, the initial vector was the
optimum vector from experiment 2 (the reference run), and
no lower bounds were placed on the $y_i$. No feasible point
was found in this run because the final constraint was the
only one not satisfied initially, and in attempting to
satisfy it, the step size was reduced to zero causing
termination.

In experiments 9 and 10, the transformation $x_i = e^{y_i}$ was
tried, first without lower bounds and then with them.
Running the problem without lower bounds produced an apparent
local optimum, while running it with lower bounds kept the
search outside the feasible region. The reason this happened
is evident from a sketch of $e^{y_i}$, which is always positive
[...] constraints 1, 3, and 12.

Experiments 11, 12 and 13 involved the transformation
$x_i = \sin^2 y_i$. The first two were run without and with lower
bounds on $y_i$ respectively, but both start from the same
initial point which for this transformation is infeasible.
To see why this is so, consider constraint 1 with $x_3$
replaced by $\sin^2 y_3$ and $x_4$ replaced by $\sin^2 y_4$. This makes
the constraint:

   $1 - \sin^4 y_3 - \sin^4 y_4 \ge 0$

Taking the gradient of this constraint with respect to y
yields the following nonvanishing components:

   component 3:   $-4(\sin^3 y_3)(\cos y_3)$

   component 4:   $-4(\sin^3 y_4)(\cos y_4)$

Evaluated at $\pi/2$ radians, both terms vanish since $\cos(\pi/2) = 0$.
Thus all elements of the gradient of constraint 1 vanish.
The same holds true for all the other constraints and the
objective function. In experiment 13 a different starting
point was used ($y_i = \pi/4$, $i = 1, \ldots, 9$), and the run was
successful in reaching a local optimum.
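The vanishing gradient can be checked numerically. The short Python sketch below evaluates the transformed constraint $1 - \sin^4 y_3 - \sin^4 y_4$ and its analytical partial derivatives at the two starting values mentioned in the text; the function names are hypothetical and the fragment is only an illustration, not part of the GRG code.

```python
import math

def constraint(y3, y4):
    """Constraint 1 after substituting x = sin^2 y: 1 - sin^4 y3 - sin^4 y4."""
    return 1.0 - math.sin(y3) ** 4 - math.sin(y4) ** 4

def gradient(y3, y4):
    """Analytical partial derivatives, -4 sin^3(y) cos(y) for each variable."""
    return tuple(-4.0 * math.sin(y) ** 3 * math.cos(y) for y in (y3, y4))

# Compare the two starting values: pi/2 (experiments 11 and 12) and pi/4 (experiment 13).
for label, y in (("pi/2", math.pi / 2), ("pi/4", math.pi / 4)):
    print(label, constraint(y, y), gradient(y, y))
```

At $\pi/2$ the constraint is violated and both partial derivatives are zero to within rounding error, so a gradient-based search has no direction in which to move. At $\pi/4$ the partial derivatives are nonzero and the search can proceed.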

In experiment 15, the transformation
$x_i = e^{y_i}/(e^{y_i} + e^{-y_i})$
was tried with no lower bounds on $y_i$. This transformation
constrains $x_i$ to the interval (0,1). As with the previous
transformation, this one never gets going because the
reduced gradient at the starting point is zero for all 9
variables.

This same group of transformations was then tried on
problem 20. As previously noted, experiment 16 is the
reference run for this problem using GRG, and the optimum
value of x is listed there. Experiment 17 handles the
lower bounds explicitly as inequality constraints. Although
it arrives at the optimum point, it takes 70 per cent
longer to run than the reference experiment.

In experiment 18, the transformation $x_i = y_i^2$ was tried,
and the transformed problem succeeded in reaching the
optimum point but required more than double the time of the
reference run. The transformation attempted in experiment 19
was $x_i = |y_i|$. Here too, the optimum point was reached
but in nearly triple the reference time. In experiment 20,
the transformation $x_i = |y_i|$ was tried but with lower bounds
on the variables removed. No feasible point was found, with
eight of the 14 equality constraints still violated at
termination of the program. The problem became much more
difficult to solve by allowing the $y_i$ to be unrestricted,
which made the equality constraints more difficult to satisfy.
The transformation $x_i = \sin^2 y_i$ was used in experiment 21
and the optimum point was found in the transformed problem
in 2.76 times the reference time.

A careless error was made in experiment 22 using the
transformation $x_i = e^{y_i}$. Lower bounds were left on all 24
variables, but no compensation was made for the fact that
negative values of $y_i$ were necessary to start from the same
starting point as in the other experiments. This error was
corrected in experiment 23 by the same transformation but
with the lower bounds removed; however, no feasible point
was found and the program was making very slow progress.
In experiment 24, the transformation
$x_i = e^{y_i}/(e^{y_i} + e^{-y_i})$ was
attempted, but was unsuccessful in finding a point that would
satisfy all constraints.

Two transformations were tried on problem 4. The
reference run and optimum point are listed in experiment 25.
The transformation $x_i = e^{y_i}$ yielded a slightly improved
objective function in experiment 26 and a different optimum
point, but also took slightly longer to solve. In
experiment 27, the transformation $x_i = |y_i|$ was used and it was found
to yield the same optimum as the reference run, and in about
the same time. The transformation $x_i = e^{y_i}$ was also tried on
problem 4 using the SUMT code (experiment 32) and provided
the correct optimum, but in over five times the CPU time
required by GRG for the same transformed problem.

Note from experiment 31 that SUMT still had not reached
the optimum point in problem 20 after four minutes of CPU
time. Thus none of the transformations considered here
were tried on problem 20 using SUMT for two reasons:

(1) excessive computation time to reach an optimum solution,

(2) excessive preparation time to determine all the
analytical first and second derivatives of the transformed
problems.

From the preceding discussion, the following observations
should be kept in mind when considering a transformation
of variables using the nonlinear programming codes
GRG and SUMT:

(1) transformations generally make a nonlinear program
harder to solve and can substantially increase the
computer time required;

(2)  transformations  are  less  likely  to  cause  difficulty  when 
used  in  problems  subject  to  finite  lower  or  upper  bounds; 

(3)  starting  points  and  bounds  on  variables  must  be  handled 
and  adjusted  carefully  when  preparing  a  transformed 
problem; 

(4)  when  confronted  with  a  real-world  problem  whose  solution 
is  not  known  in  advance,  it  is  always  wise  to  try  several 
different  carefully  chosen  starting  points  to  determine 
if  the  solution  can  be  improved. 

In  view  of  these  observations,  it  can  be  concluded  that 
variable  transformations  have  an  adverse  effect  on  nonlinear 
programming  codes  such  as  SUMT  and  GRG,  and  that  they  should 
not  be  attempted  unless  the  codes  have  difficulty  in  reach¬ 
ing  a  solution  using  the  original  variables. 

C.  TRANSFORMATIONS  OF  VARIABLES  TO  OBTAIN  SEPARABILITY 

Once the analyst decides upon separable programming as
a method for solution, he is faced with the problem of how
to transform his model to separable form. This section
discusses some of the transformations that can be used to
eliminate interaction between variables.

Converting a nonlinear programming model into an
approximate version with separable functions increases the size of
the model in two ways. First, the separability transformation
introduces new constraints and variables. Second, the
subsequent linearization expands the number of constraints
and variables even further.

The  transformations  described  in  this  section  are  located 
in  several  references  including  Hadley  ^1^,  Beale  ^17/  ^18/, 
Wagner  ^21/,  and  McCormick  l2%/ ,  No  single  reference  pro¬ 
vides  a  complete  treatment  of  all  the  transformations  dis¬ 
cussed  here. 


1.  Transformations  for  Product  Terms  and  Exponential 
Expressions 

Any product term of the form $x_i x_j$ appearing in a
constraint or objective function can be eliminated by defining
two new variables $y_i$ and $y_j$ as follows:

   $y_i = \dfrac{x_i + x_j}{2}, \qquad y_j = \dfrac{x_i - x_j}{2}$    (16)

Then $x_i x_j = y_i^2 - y_j^2$, which provides a separable form in the
new variables. For every product term in the problem
formulation, substitute $y_i^2 - y_j^2$, and add the two additional
constraints defined in (16). Since $y_j$ involves the difference
of the original variables, it will be unrestricted in sign
even if $x_i$ and $x_j$ are non-negative.

An alternative method to separate $x_i x_j$ is a log
transformation which can be generalized to handle product terms
of three or more variables. The original variables, however,
must be strictly positive since $\ln 0 = -\infty$, and the logarithm
of a negative argument is undefined. Setting $y_k = x_i x_j$ and
taking the natural logarithm yields:

   $\ln y_k = \ln x_i + \ln x_j$    (17)

In the original problem formulation, each $x_i x_j$ term is
replaced by $y_k$, and the additional constraint (17) is imposed.
If the only nonseparable term in the problem is $x_i x_j$,
introduction of the variable $y_k$ and the additional constraint
will result in a separable format.
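Both separations of a product term can be verified with a few lines of Python. The sketch below is illustrative only: the function names product_split and log_split and the sample values are assumptions, not part of any code used in this study.

```python
import math

def product_split(x_i, x_j):
    """New variables of equation (16): half-sum and half-difference."""
    return (x_i + x_j) / 2.0, (x_i - x_j) / 2.0

def log_split(x_i, x_j):
    """Return ln(y_k) from equation (17); both inputs must be strictly positive."""
    if x_i <= 0 or x_j <= 0:
        raise ValueError("log transformation needs strictly positive variables")
    return math.log(x_i) + math.log(x_j)

# Hypothetical sample points with non-negative original variables.
for x_i, x_j in [(3.0, 5.0), (0.5, 4.0), (2.0, 7.0)]:
    y_i, y_j = product_split(x_i, x_j)
    y_k = math.exp(log_split(x_i, x_j))
    # Columns: original product, y_i^2 - y_j^2, y_k recovered from (17), and y_j.
    print(x_i * x_j, y_i ** 2 - y_j ** 2, y_k, y_j)
```

In each row the first three numbers agree up to rounding, and the last column is negative, which shows that $y_j$ must be left unrestricted in sign even though the original variables are positive.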

To separate exponential expressions, an example of which is
$\exp(a x_i + b x_j)$, introduce the new variable $y_k$ and take
the natural logarithm as follows:

   $y_k = \exp(a x_i + b x_j)$

   $\ln y_k = a x_i + b x_j$    (18)

For expressions of the form $x_i^{x_j}$, $x_i > 0$, $x_j \ge 0$, take
the natural logarithm and proceed as follows:

   $v_j = x_j + \varepsilon, \quad \varepsilon > 0$

   $z = v_j \ln x_i$

   $v_j = y_i + y_j$    (19)

   $\ln x_i = y_i - y_j$

Therefore $z = v_j \ln x_i$ can be replaced in the formulation by:

   $z = y_i^2 - y_j^2$    (20)


Himmelblau problem 16 has numerous cross product terms
in both the objective function and constraints. All cross
product terms were replaced by transformations of the form
$y_i^2 - y_j^2$. The effect of these transformations was to
increase both the dimensionality and degree of difficulty of
the problem. Refer to experiments 33 and 34 in Appendix B.
The original problem had nine variables and 13 nonlinear
inequality constraints. The transformed problem had 41
variables, 13 nonlinear inequality constraints, and an
additional 32 equality constraints. The two experiments
differed only in a parameter tolerance; however, each
converged to the same local optimum point. Therefore, when
using GRG, the analyst should avoid making product
transformations because of the additional complexity entailed.
However, if only a linear programming package with separable
option is available, the analyst must make these
transformations.

2. Transformation of Quadratic Expressions Into
Diagonal Form

Every quadratic form can be expressed in terms
of a symmetric matrix Q associated with its coefficients.
The quadratic programming problem can be formulated in
matrix notation as follows:

   minimize     $x^T Q x$
   subject to:  $A x \ge b$    (21)
                $x \ge 0$

A quadratic form is defined to be positive definite if
$x^T Q x$ is strictly positive for all $x \ne 0$; it is defined to
be positive semi-definite if $x^T Q x$ is non-negative for all
x.

By a suitable change of variables

   $x = R y$    (22)

the quadratic problem can be transformed to

   minimize     $y^T R^T Q R y$
   subject to:  $A R y \ge b$    (23)
                $R y \ge 0$

If an R matrix can be found that will make $R^T Q R$
diagonal, then this will make $y^T R^T Q R y$ a sum of squares
and therefore separable. It is possible to diagonalize
any symmetric, nonsingular matrix without the laborious
effort entailed by the Gram-Schmidt orthogonalization
process in solving for eigenvalues (see Ref. [29]). For a
symmetric matrix Q, a sequence of elementary row operations
followed by the same sequence of elementary column
operations will diagonalize the matrix. The same sequence of
row operations when applied to the identity matrix will yield
a matrix $R^T$, such that $R^T Q R$ is a diagonal matrix.

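To illustrate this point, consider the following
symmetric matrix:

$$Q = \begin{pmatrix} 1 & 2 & -3 \\ 2 & 5 & -4 \\ -3 & -4 & 8 \end{pmatrix} \qquad (24)$$

The first step is to augment it with the identity matrix:

$$(Q, I) = \left(\begin{array}{rrr|rrr} 1 & 2 & -3 & 1 & 0 & 0 \\ 2 & 5 & -4 & 0 & 1 & 0 \\ -3 & -4 & 8 & 0 & 0 & 1 \end{array}\right) \qquad (25)$$

Step 2: pivot on row 1, to yield:

$$\left(\begin{array}{rrr|rrr} 1 & 2 & -3 & 1 & 0 & 0 \\ 0 & 1 & 2 & -2 & 1 & 0 \\ 0 & 2 & -1 & 3 & 0 & 1 \end{array}\right)$$

Step 3: pivot on column 1, to yield:

$$\left(\begin{array}{rrr|rrr} 1 & 0 & 0 & 1 & 0 & 0 \\ 0 & 1 & 2 & -2 & 1 & 0 \\ 0 & 2 & -1 & 3 & 0 & 1 \end{array}\right)$$

Step 4: pivot on row 2, to yield:

$$\left(\begin{array}{rrr|rrr} 1 & 0 & 0 & 1 & 0 & 0 \\ 0 & 1 & 2 & -2 & 1 & 0 \\ 0 & 0 & -5 & 7 & -2 & 1 \end{array}\right)$$

Step 5: pivot on column 2, to yield a diagonal matrix:

$$(R^T Q R, R^T) = \left(\begin{array}{rrr|rrr} 1 & 0 & 0 & 1 & 0 & 0 \\ 0 & 1 & 0 & -2 & 1 & 0 \\ 0 & 0 & -5 & 7 & -2 & 1 \end{array}\right)$$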

The  actual  computation  involved  Gauss-Jordan  elimin¬ 
ation  pivoting  on  row  1,  column  1,  row  2,  column  2  in  order, 
so  as  to  reduce  all  off  diagonal  elements  to  zero.  These 
simple  operations  can  be  programmed  easily.  A  program  was 
written  for  this  study  to  determine  the  amount  of  compu¬ 
tation  time  required  to  diagonalize  matrices  of  different 
sizes.  The  results  of  the  experiments  are  presented  in  the 
next  subsection. 
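The pivoting procedure can be sketched in a few lines of Python. This is an illustrative reconstruction under stated assumptions, not the program written for this study: the function name diagonalize and the use of exact rational arithmetic are choices made here, and the sketch simply stops with an error when it meets a zero pivot instead of performing the interchanges discussed in the text.

```python
from fractions import Fraction

def diagonalize(q):
    """Return (d, rt) where d = rt * q * rt^T is diagonal and rt plays the role of R^T.

    Assumes q is symmetric and that no zero pivot is encountered.
    """
    n = len(q)
    a = [[Fraction(v) for v in row] for row in q]
    rt = [[Fraction(int(i == j)) for j in range(n)] for i in range(n)]
    for p in range(n):
        if a[p][p] == 0:
            raise ValueError("zero pivot: a row/column interchange is needed")
        for i in range(p + 1, n):
            m = a[i][p] / a[p][p]
            if m == 0:
                continue
            # Row operation, applied to Q and to the identity copy.
            for k in range(n):
                a[i][k] -= m * a[p][k]
                rt[i][k] -= m * rt[p][k]
            # Matching column operation, applied to Q only.
            for k in range(n):
                a[k][i] -= m * a[k][p]
    return a, rt

# The symmetric matrix of equation (24).
d, rt = diagonalize([[1, 2, -3], [2, 5, -4], [-3, -4, 8]])
print([[int(v) for v in row] for row in d])
print([[int(v) for v in row] for row in rt])
```

For the matrix of (24) this prints the diagonal matrix with entries 1, 1, -5 and the lower triangular matrix with rows (1, 0, 0), (-2, 1, 0), (7, -2, 1), in agreement with the hand computation. The sketch applies each row operation and its matching column operation together; because a column operation on column i leaves the pivot column unchanged, the result is the same as pivoting on the whole row first and the whole column second.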

Complications arise whenever there is a zero element
along the diagonal, in which case the Gauss-Jordan reduction
scheme breaks down. The algorithm can be made to handle the
case of an arbitrary zero element along the diagonal, or
the case in which all diagonal elements are zero. In the
first case, the matrix can still be diagonalized by
interchanging the row in which it appears with the next lower row
in which the diagonal element is non-zero. This is followed
by interchanging the corresponding two columns. This has the
effect of moving the zero element down the diagonal to the
next lower position, allowing normal pivot operations to be
continued. It should be observed that each iteration of the
algorithm can introduce a zero in those diagonal elements
below the current pivot row. Whenever a zero element is
encountered, this procedure is repeated.

If all diagonal elements are zero, choose i, j such that
$a_{ij} \ne 0$, and apply the row operation $R_i \leftarrow R_i + R_j$, and the
corresponding column operation $C_i \leftarrow C_i + C_j$ (where $\leftarrow$ means
"is replaced by"). This has the effect of bringing $2a_{ij}$
into the $a_{ii}$ diagonal position. This element can then be
moved to the first diagonal position by another interchange
of row and column. An $a_{ij} \ne 0$ must exist because the Q
matrix is required to be nonsingular. The matrix will
be in the following form:

$$Q = \begin{pmatrix} a_{11} & a_{1j} \\ a_{j1} & B \end{pmatrix} \qquad (27)$$

Here B is a symmetric matrix of lower order than Q,
still having zeros along its diagonal. The diagonalization
algorithm is then used to zero out the elements in the first
row and column. This process is then repeated until, by
induction, Q is brought into diagonal form.

There are several observations to be made regarding this
transformation. First, the matrix $R^T$ will be lower triangular
only if no zero elements appear along the diagonal during
execution of the algorithm. When this is the case, R will be
upper triangular. Second, the diagonal form resulting from
this algorithm does not yield the eigenvalues of the matrix
but rather a simple way of transforming a quadratic form
into separable form. Third, the algorithm must include a
test at each iteration to determine if the diagonal element
in the pivot row is equal to or very close to zero.

3. Experiments in the Diagonalization of Quadratic
Forms

Table 3 lists the experiments that were conducted
using the diagonalization algorithm described in the
previous subsection. Two types of matrices were diagonalized.
The first type was tridiagonal, in which the main diagonal
and the adjacent diagonals all had non-zero integer elements
from the interval (-50,50). The entries were
determined by a random number generator. The second type was a
random entry matrix in which the random number generator
was used to determine not only the magnitude of the element,
but also its location in the matrix. For both types of
matrices, the sizes tested were 20x20, 40x40,
60x60, and 80x80. Each random entry matrix size tested
was further classified and tested as sparse, medium, or
dense, depending on the number of non-zero elements. The
classification sparse, medium, and dense was used to
describe matrices having, respectively, 10%, 50%, and 90% of
their elements non-zero. A timer routine was used to compute
the actual time spent in diagonalization, since the larger
matrices took a proportionately greater time in generating
matrix elements.

TABLE 3: Application of a Diagonalization Algorithm

A. Tridiagonal Matrix

Matrix Size   Nonzero Elements   Time to Diag. (sec)   Total CPU Time (sec)

20x20                58                 0.12                   0.94
40x40               118                 1.12                   3.65
60x60               178                 3.93                   9.42
80x80               238                 9.24                  18.58

B. Random Entry Matrix

Matrix Size   Class    Nonzero Elements   Time to Diag. (sec)   Total CPU Time (sec)

20x20         Sparse          40                 0.15                   0.98
              Medium         200                 0.13                   1.01
              Dense          360                 0.13                   1.14

40x40         Sparse         160                 1.12                   3.61
              Medium         800                 1.12                   4.03
              Dense         1440                 1.22                   4.93

60x60         Sparse         360                 4.05                   9.97
              Medium        1800                 3.95                  10.66
              Dense         3240                 3.86                  11.64

80x80         Sparse         640                 9.49                  18.96
              Medium        3200                 9.80                  21.55
              Dense         5760                 9.26                  22.72

The computational algorithm used did not take advantage
of the degree of sparseness of the test matrices, so the
diagonalization time is essentially the same for a given
matrix size. The time did not vary in direct proportion
to the number of matrix elements but instead increased more
rapidly as the size increased. In fact, when the number
of rows (and columns) is increased by a factor of k, the
time can be expected to increase by a factor of $k^3$. For
example, the 80x80 matrix was 1.33 times as large in each
dimension as the 60x60 matrix, but the computational time
increased by a factor of 2.41.
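The cubic growth can be checked directly against the diagonalization times reported in Table 3. The following Python fragment, offered only as an arithmetic check with hypothetical variable names, compares the predicted ratio $k^3$ for $k = 80/60$ with the observed ratios for each matrix class.

```python
# Diagonalization times in seconds for the 60x60 and 80x80 matrices, from Table 3.
times_60 = {"tridiagonal": 3.93, "sparse": 4.05, "medium": 3.95, "dense": 3.86}
times_80 = {"tridiagonal": 9.24, "sparse": 9.49, "medium": 9.80, "dense": 9.26}

k = 80 / 60                 # growth factor in the number of rows and columns
predicted = k ** 3          # expected growth in time for a cubic algorithm
print(f"k = {k:.2f}, predicted ratio k^3 = {predicted:.2f}")

for name in times_60:
    observed = times_80[name] / times_60[name]
    print(f"{name:12s} observed ratio = {observed:.2f}")
```

The predicted ratio is about 2.37, and the observed ratios for the four classes all fall between roughly 2.3 and 2.5, consistent with the factor quoted above.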

The analyst must be aware of the time and preparation
necessary to convert a quadratic expression into diagonal
form, and must weigh this against whether to apply a series
of product type transformations in order to separate the
variables.

To conclude this section, the analyst must keep
in mind that transformations to separable form can greatly
increase the size of the model and thus make it less
economical to solve. The analyst should also be aware that
the next step, in the conversion of the separable form to
piecewise linear approximations of the functions involved,
expands the model further by introducing special sets of
variables.

D. TRANSFORMATION TO UNCONSTRAINED PROBLEMS
BY USE OF PENALTY AND BARRIER FUNCTIONS

A general nonlinear programming problem can be reduced
to a sequence of unconstrained optimization problems by a
transformation which combines the objective function and
constraints. The minima of the new unconstrained function
approximate the solution to the constrained problem.
Exterior point techniques compute a sequence of points
generally outside the feasible region of the original
problem. This is accomplished by addition of a penalty
term to the objective function that is a function of only
the violated constraints. A useful penalty function for
inequality constraints of the form $g_j(x) \ge 0$ is:

   $P_j(x) = [\min(0, g_j(x))]^2$    (28)

The optimization problem becomes the minimization of:

   $f(x) + r_k \sum_{j=1}^{m} P_j(x)$    (29)

where $r_k$ is an appropriate weight. In the limit as $r_k$
becomes large, constrained minima of the original problem
are approached by unconstrained minima of the transformed
problem.
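The exterior point idea can be illustrated on a one-variable problem. In the Python sketch below, the test problem (minimize $x^2$ subject to $x - 1 \ge 0$), the golden-section line search, and the sequence of weights are all assumptions chosen for illustration; they do not come from the SUMT code.

```python
def golden_section(func, lo, hi, tol=1e-10):
    """Minimize a unimodal function of one variable on [lo, hi]."""
    ratio = (5 ** 0.5 - 1) / 2
    a, b = lo, hi
    while b - a > tol:
        c = b - ratio * (b - a)
        d = a + ratio * (b - a)
        if func(c) < func(d):
            b = d
        else:
            a = c
    return (a + b) / 2

f = lambda x: x * x                        # objective function
g = lambda x: x - 1.0                      # inequality constraint g(x) >= 0
penalty = lambda x: min(0.0, g(x)) ** 2    # penalty term of equation (28)

weights = (1.0, 10.0, 100.0, 1000.0)
minimizers = []
for r in weights:
    # Unconstrained minimization of f(x) + r * P(x), as in equation (29).
    x_r = golden_section(lambda x: f(x) + r * penalty(x), -5.0, 5.0)
    minimizers.append(x_r)
    print(r, x_r, g(x_r) >= 0)
```

For this problem the unconstrained minimizer is $r/(1 + r)$, so the iterates approach the constrained solution $x = 1$ from outside the feasible region as the weight grows, and none of them satisfies the constraint exactly.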

An important attribute of the penalty-function
approach is that the initial search point does not have
to satisfy the constraints. A simultaneous solution to the
constraint equations is attained concurrently with the
attainment of constrained relative minima. A major
disadvantage of exterior point transformations is that the
transformed problem becomes progressively ill-conditioned
as the penalty weight increases. In addition, the
numerical errors in the penalty terms become significant for large
penalty coefficients. Because of these difficulties, it
may be very difficult to satisfy the constraints with any
desired accuracy.

Interior  point  techniques  compute  a  sequence  of 
feasible  solutions  to  the  original  problem.  The  inside 
penalty  function  establishes  a  barrier  within  the  feasible 
region  which  can  not  be  crossed  by  a  search  for  the  con¬ 
strained  minimum.  This  transformation  prevents  the 
solutions  from  violating  the  inequality  constraints  which 
are  gradually  approached  as  the  barrier  is  relaxed.  Use¬ 
ful  barrier  functions  are  ^  ^) • 

The optimization problem becomes:

$$\text{minimize } f(x) + r_k \sum_{j=1}^{m} B_j(x) \qquad (30)$$

where $r_k$ is an appropriate weight and there are $m$ such inequality constraints.

The starting point for interior point techniques must be a point which strictly satisfies the constraints. After a minimum has been obtained for one value of $r_k$, a new and smaller $r_k$ is used in the next search. Each constrained relative minimum of $f(x)$ is approached asymptotically by a relative minimum of (30).
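The interior point idea can be sketched in the same way. The example assumes a logarithmic barrier $B(x) = -\ln g(x)$ and the toy problem of minimizing $x^2$ subject to $x - 1 \ge 0$; both choices, and all names in the code, are illustrative assumptions rather than part of any code discussed in this thesis.

```python
import math


def barrier_objective(x, r):
    """Eq. (30) with a logarithmic barrier for a toy problem (an assumption):
    minimize x^2 subject to g(x) = x - 1 >= 0."""
    g = x - 1.0
    if g <= 0.0:
        return math.inf  # the barrier cannot be crossed
    return x * x - r * math.log(g)


def minimize_1d(func, lo, hi, iterations=200):
    """Ternary search; adequate here because the barrier function is convex for x > 1."""
    for _ in range(iterations):
        m1 = lo + (hi - lo) / 3.0
        m2 = hi - (hi - lo) / 3.0
        if func(m1) < func(m2):
            hi = m2
        else:
            lo = m1
    return 0.5 * (lo + hi)


def interior_sequence(weights=(1.0, 0.1, 0.01, 0.001)):
    """Feasible minimizers for a decreasing sequence of weights r_k."""
    return [minimize_1d(lambda x, r=r: barrier_objective(x, r), 1.0, 5.0)
            for r in weights]


if __name__ == "__main__":
    for r, x in zip((1.0, 0.1, 0.01, 0.001), interior_sequence()):
        print(f"r = {r:6.3f}   x = {x:.6f}")
```

Here the minimizer is $(1 + \sqrt{1 + 2 r_k})/2$, which is strictly feasible for every $r_k > 0$ and tends to the constrained minimum $x = 1$ as $r_k$ is reduced.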
Interior point methods are incapable of handling equality constraints and are combined with a penalty term to counter this difficulty. The sequential unconstrained minimization technique (SUMT) of Fiacco and McCormick /3/ is a mixed interior-exterior penalty function technique.
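For the general nonlinear programming problem:

$$\begin{aligned} \text{minimize } & f(x) \\ \text{subject to: } & g_j(x) \ge 0, \quad j = 1, 2, \ldots, m_1 \\ & h_j(x) = 0, \quad j = m_1 + 1, \ldots, m \end{aligned} \qquad (31)$$

the SUMT code minimizes the unconstrained penalty function transformation:

$$P(x, r_k) = f(x) - r_k \sum_{j=1}^{m_1} \ln g_j(x) + \frac{1}{r_k} \sum_{j=m_1+1}^{m} h_j^2(x) \qquad (32)$$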


The objective function and inequality constraints can be nonlinear functions of the variables but the equality constraints must be linear functions of the variables in order to guarantee convergence to the solution of the nonlinear programming problem.

There are several disadvantages associated with the mixed interior-exterior point methods. The first is that the Hessian matrix of the $P(x, r_k)$ function becomes progressively more ill-conditioned as the minimum is approached, so search directions may be misleading. Second, the rate of convergence is slowed considerably as the structure of the unconstrained problem becomes more unfavorable. The biggest disadvantage from the user's standpoint is the amount of preparation time required to compute the analytical first and second derivatives of the original objective function and constraints. Unless most of the derivatives are zero or constant, this not only requires a lot of time but is also prone to human error.

References /30/-/33/ give good discussions of barrier and penalty function techniques. The excellent treatment of unconstrained nonlinear programming techniques makes Ref. /30/ particularly useful as a general reference.

Although it is not the intent of this thesis to compare the efficiency of the SUMT and GRG codes, it should be noted that the SUMT method generally required a significantly greater amount of computation time than did the GRG method. As pointed out in various sections of this chapter and the next one, GRG is a very versatile code in that scaling and transformations can be applied easily by modifying the user supplied subroutine GCOMP. On the other hand, SUMT requires computation of analytical gradient and Hessian functions for the GRAD1 and MATRIX subroutines. This can require a substantial amount of preparation and debugging time. However, SUMT does have an option that enables it to compute numerical approximations for the gradient and Hessian functions by finite differencing.
V. THE USE OF SCALING, ROTATION, AND TRANSLATION IN CONSTRAINED OPTIMIZATION PROBLEMS


A.  USE  OF  SCALING  IN  NONLINEAR  PROGRAMMING  PROBLEMS 

The  objective  of  this  chapter  is  to  determine  how 
sensitive  the  GRG  code  is  to  scaling  in  nonlinear  programming 
problems.  The  analyst  is  responsible  for  scaling  his  problem 
and  very  real  difficulties  can  be  encountered  if  he  attempts 
to  solve  a  problem  using  GRG  without  making  an  effort  to 
scale  it  first.  Commercial  linear  programming  codes  are 
forgiving  in  the  sense  that  the  code  will  perform  row  and 
column  scaling  operations  on  the  problem  tableau.  This  is 
not  true  of  GRG. 

Although  scaling  is  important  in  both  linear  and 
nonlinear  programming  problems,  it  is  especially  critical 
in  nonlinear  programming.  Attempting  to  solve  a  linear 
programming  problem  without  scaling  is  to  run  the  risk  of 
introducing  round-off  errors  which  alter  the  original 
problem  and  may  even  result  in  a  false  optimum  being 
designated.  However,  in  the  nonlinear  programming  problem, 
if  the  problem  is  poorly  scaled,  there  is  a  good  possibility 
the  nonlinear  programming  code  used  will  be  unable  to  even 
find  a  feasible  solution,  let  alone  provide  an  optimal 
solution. 
The efficiency and rate of convergence of optimization methods depend very critically on the given function f(x)
and the scales used for the variables. A great deal of caution must be used with even the simplest scale changes, such as $x_i = a_i y_i$, $a_i > 0$, since many important aspects of the optimization algorithm will not be left invariant.

To illustrate some points consider the following problem:

$$\text{minimize } f(x) = x_1^2 + x_2^2, \qquad x_1, x_2 \ge 0, \qquad x^0 = (1, 1)^T \qquad (33)$$

The function isocontours are concentric circles in this case. The method of steepest descent gives the descent direction $d = -\nabla f(1,1) = (-2,-2)^T$ and the solution $x^* = (0,0)^T$ in one step. By introducing new variables:


$$y_1 = 0.1\,x_1, \qquad y_2 = x_2 \qquad (34)$$

the efficiency of steepest descent is radically changed. The isocontours of the new objective function

$$f(y) = 100\,y_1^2 + y_2^2$$

form a deep and narrow valley along the $y_2$ coordinate, and the descent direction calculated at the same point as before, $d = -\nabla f(0.1, 1) = (-20, -2)^T$, makes an angle of nearly 90 degrees with the $y_2$ axis. Thus steepest descent is inefficient in handling the rescaled problem.
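The effect of the scale change (34) on steepest descent can be checked with a short sketch. It assumes an exact line search along the negative gradient, which is available in closed form for a quadratic; the function names are illustrative assumptions.

```python
import math


def steepest_descent(a, b, start, tol=1e-8, max_iter=10000):
    """Minimize a*y1^2 + b*y2^2 by steepest descent with exact line search.

    Returns the final point and the number of steps taken.
    """
    y1, y2 = start
    for k in range(max_iter):
        g1, g2 = 2.0 * a * y1, 2.0 * b * y2
        gg = g1 * g1 + g2 * g2
        if math.sqrt(gg) < tol:
            return (y1, y2), k
        # Exact step length for a quadratic: alpha = g.g / (g.H.g), H = diag(2a, 2b)
        alpha = gg / (2.0 * a * g1 * g1 + 2.0 * b * g2 * g2)
        y1 -= alpha * g1
        y2 -= alpha * g2
    return (y1, y2), max_iter


def angle_with_second_axis(direction):
    """Angle in degrees between a direction and the second coordinate axis."""
    d1, d2 = direction
    return math.degrees(math.acos(abs(d2) / math.hypot(d1, d2)))


if __name__ == "__main__":
    print(steepest_descent(1.0, 1.0, (1.0, 1.0)))    # original variables, eq. (33)
    print(steepest_descent(100.0, 1.0, (0.1, 1.0)))  # rescaled variables, eq. (34)
    print(angle_with_second_axis((-20.0, -2.0)))
```

In the original variables the minimum is reached in a single step, while in the rescaled variables the search zigzags across the valley for many steps before the gradient becomes small.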

Second order Newton methods using the Hessian matrix do not have this drawback, and may be considered invariant with respect to linear changes of scale of the variables. The direction vector is reoriented towards the solution in the scaled system of variables. For the example of this section:

$$d = -\,\mathrm{Hess}(0.1, 1)^{-1}\,\nabla f(0.1, 1) = -\begin{pmatrix} 0.005 & 0 \\ 0 & 0.5 \end{pmatrix} \begin{pmatrix} 20 \\ 2 \end{pmatrix} \qquad (35)$$

$$d = -(0.1, 1)^T \qquad (36)$$

When f(x) is a very complicated function of its variables, it may be very difficult to scale the problem. Intuitively, in general constrained nonlinear optimization problems, "well-scaled" problems are those in which similar changes in the variables lead to similar changes in the objective function.

For a quadratic function of n variables, it is possible to diagonalize the quadratic form as suggested in the preceding chapter, and then scale the diagonal elements of the $R^T Q R$ matrix so that they are approximately equal. The next section presents and discusses detailed experiments that were done on two well-scaled problems using the GRG nonlinear programming code.
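The Newton direction for the rescaled example $f(y) = 100 y_1^2 + y_2^2$ at the point $(0.1, 1)$ can be verified numerically. The sketch below assumes a diagonal Hessian, which holds for this example only; the names are illustrative.

```python
def gradient_scaled(y):
    """Gradient of f(y) = 100*y1^2 + y2^2."""
    return [200.0 * y[0], 2.0 * y[1]]


# The Hessian of f(y) is constant and diagonal: diag(200, 2).
HESSIAN_DIAGONAL = [200.0, 2.0]


def newton_direction(hessian_diagonal, gradient):
    """Newton direction d = -H^{-1} g for a diagonal Hessian H."""
    return [-g / h for h, g in zip(hessian_diagonal, gradient)]


y0 = [0.1, 1.0]
d = newton_direction(HESSIAN_DIAGONAL, gradient_scaled(y0))
y1 = [a + b for a, b in zip(y0, d)]

if __name__ == "__main__":
    print("Newton direction:", d)
    print("point after one full step:", y1)
```

The direction points from $(0.1, 1)$ straight at the origin, so one full Newton step reaches the minimum in the scaled variables just as steepest descent did in the original ones.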
The problem of scaling has another interesting aspect. In a simple two-dimensional problem, if $10^{-5} \le x_1 \le 10^{-2}$ and $10^{-2} \le x_2 \le 10^{3}$, the feasible region has the shape of a very narrow band. It would be very difficult to search for a minimum in such a band because the step length would have to be very short to avoid violation of the constraint imposed by $x_1$.


Difficulties are also encountered by performing arithmetic operations with numbers of different orders of magnitude. To scale the variables so that they are of the same order of magnitude, each variable $x_i$ can be replaced by a new variable $y_i$ defined as follows:

(37)

where $y_i$ lies in the interval $(-a_i, a_i)$. If all $a_i$ are taken equal to one, then all variables would be located within a hypercube of side length 2 centered at the origin of the new coordinate system.

References  and  /,3^  contain  descriptions  of 

self-scaling  algorithms  for  unconstrained  minimization 
problems . 

B. COMPUTATIONAL EXPERIENCE WITH SCALING, TRANSLATION, AND ROTATION OPERATIONS ON NONLINEAR PROGRAMMING PROBLEMS


The effect of translation, rotation, and scaling operations can be illustrated by the following sketches of a two-dimensional quadratic function.

Figure 2: Effect of translation, rotation, and scaling operations on a quadratic function

The original problem is transformed to a new coordinate system by translation of the old origin to the minimum of the function. The axes are then rotated to achieve symmetry of the function contours with respect to the new coordinate system. Finally, the coordinates are scaled to make the function contours circular.

Appendix B enumerates the experiments of scaling, translation, and rotation that were applied to problems 16 and 20 using the GRG code. All the experiments were conducted using GRG because the user supplied subroutines could be easily modified by the simple addition of a few Fortran statements, and the appropriate adjustment of the input vector.

The experiments on scaling were done in a reverse sense: given a fine-tuned, properly scaled test problem to begin with, at what point would different scale magnitudes affect the ability of GRG to find a solution?

Refer again to Appendix C for a description of GRG. The authors of that code feel that proper scaling of variables and functions is critical to the success of the code. They further recommend that constraints be scaled to have absolute values below 100. There are no printouts of the gradients of the constraints and objective function, but the user should suspect scaling problems if there are large values in the reduced gradient array which prints out with the final solution information. These values should all be approximately zero when the program terminates.

The experiments were run by reading in an initial vector y and multiplying the components of y by appropriate scaling factors to get a new vector x, which was used to evaluate the constraints and objective function. In successive iterations the code tries to optimize the y vector, and in so doing, should construct the equivalent optimum x vector. The starting point was modified so that the same starting x vector is used in the functional evaluations. There was no way to use constraint values to predict scaling difficulties in these experiments. The test problems were adjusted to account for scaling factors in such a manner that the same starting point was always used. Consequently, the initial constraint values were always the same.
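The experimental procedure just described can be sketched as a wrapper around the user-supplied functions. This is a Python illustration only, not the Fortran GCOMP subroutine; the function names and the toy objective and constraint are hypothetical.

```python
def make_scaled_problem(objective, constraints, factors, x_start):
    """Wrap functions of x so that an optimizer works on y, with x_i = factor_i * y_i.

    The starting y vector is chosen so that the starting x vector is unchanged.
    """
    def to_x(y):
        return [s * yi for s, yi in zip(factors, y)]

    def objective_y(y):
        return objective(to_x(y))

    def constraints_y(y):
        return constraints(to_x(y))

    y_start = [xi / s for xi, s in zip(x_start, factors)]
    return objective_y, constraints_y, y_start, to_x


# Hypothetical toy functions, used only to exercise the wrapper.
def toy_objective(x):
    return sum(xi * xi for xi in x)


def toy_constraints(x):
    return [3.0 - sum(x)]


if __name__ == "__main__":
    obj_y, con_y, y0, to_x = make_scaled_problem(
        toy_objective, toy_constraints, [100.0, 10.0, 1.0], [1.0, 1.0, 1.0])
    print("starting y:", y0)
    print("objective and constraints at start:", obj_y(y0), con_y(y0))
```

Because the starting x vector is the same for every choice of factors, the initial objective and constraint values are identical across runs; only the magnitudes of the components seen by the optimizer change.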

In experiments 35 and 38 on problem 16, the initial components of the y vector had the value 0.01, 0.1, or 1. Both experiments ran to the correct optimum. Likewise, experiment 39 resulted in the correct optimum with the initial components of y each being 0.1. In experiment 36, the y vector initially had components of .0001 or 1, causing the initial step size to be $.8\,(10^{-7})$, which was quickly reduced to zero in an attempt to satisfy the constraints. This caused the program to terminate. Experiment 37 produced the same result. The initial components were either .001, .01, or 1; the initial step size was $(10^{-6})$, which was quickly reduced to zero, causing program termination. It should be noted that in the experiments where the scaling factors did not prevent GRG from finding the optimum point, the time required to solve the problem increased over the reference time of 2.90 seconds.

Problem 20 proved to be a much harder problem to solve when using scale factors. Its sensitivity is probably due to the equality constraints, and the ratios of one variable to a weighted sum of several variables appearing in most of the constraints. In experiments 49 through 52, although only moderate scale factors were used, each program eventually terminated without finding a feasible point when the step size was reduced to zero. In each of these experiments, what was most noticeable was the difference in magnitude of the components of the reduced gradient.

The  largest  difference  occurred  when  all  variables  were 
scaled  by  the  same  factor.  The  scale  factor  of  10  produced 

g 

reduced  gradient  elements  as  large  as  10  while  terms  as 

5 

large  as  10  occurred  when  the  scale  factor  was  2.  At  the 
optimum point, all elements of the reduced gradient should equal zero. Only in experiment 53, where a scale reduction was tried, was GRG successful in reaching the optimum point, taking about 1 second longer than the reference run.

Translation of coordinates was tried in experiment 40 on problem 16 and in experiment 54 on problem 20. The simple translation made was:

$$x_i = y_i + B_i, \qquad i = 1, \ldots, n \qquad (38)$$

where $y_i$ is the $i$th component of the input vector, and $x_i$ is the $i$th component of the vector used in the functional evaluations. Both problems were solved easily, each taking approximately one second longer to solve than the corresponding reference run. Translation did not appear to be important in these experiments. A case where it might be of use is in a goal programming model formulated for use with GRG in which each decision variable has the same goal.

Finally, a number of pairwise rotation experiments were tried on problems 16 and 20. These comprise experiments 41 through 48, and experiments 55 through 64. Single pair rotations were of the form:

$$\begin{aligned} x_i &= y_i - y_j \\ x_j &= y_i + y_j \end{aligned} \qquad (39)$$

where, as previously, $y_i$ is a component of the input vector and $x_i$ is a component of the vector used in evaluating constraints and the objective function. All of these single pair rotations worked, although the global optimum was not always reached. In general, experiments on problem 16 took about 0.5 seconds, and experiments on problem 20 about 5.0 seconds, longer than the corresponding reference runs.

In experiment 47, a multiple pair rotation was tried on problem 16 and was successful in reaching the global optimum. The final experiment on problem 16 was experiment 48, in which the input vector y was premultiplied by an arbitrary matrix consisting of +1's, -1's, and 0's. A Gauss-Jordan reduction program was used to determine the initial y vector to provide the same starting x vector. Again, GRG solved the problem quickly with the correct optimum.
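The determination of a starting y vector for a transformation x = P y can be sketched as follows. The matrix builder encodes the single pair form (39); the Gauss-Jordan routine is a generic textbook version written for this illustration and is not the program used in the experiments. All names are assumptions.

```python
def pair_rotation_matrix(n, pairs):
    """Build P so that x = P y applies x_i = y_i - y_j, x_j = y_i + y_j to each
    (i, j) pair (0-based indices) and leaves the other variables unchanged."""
    P = [[1.0 if r == c else 0.0 for c in range(n)] for r in range(n)]
    for i, j in pairs:
        P[i][i], P[i][j] = 1.0, -1.0
        P[j][i], P[j][j] = 1.0, 1.0
    return P


def gauss_jordan_solve(A, b):
    """Solve A y = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(A)
    M = [row[:] + [bi] for row, bi in zip(A, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        if abs(M[pivot][col]) < 1e-12:
            raise ValueError("matrix is singular")
        M[col], M[pivot] = M[pivot], M[col]
        p = M[col][col]
        M[col] = [v / p for v in M[col]]
        for r in range(n):
            if r != col:
                factor = M[r][col]
                M[r] = [vr - factor * vc for vr, vc in zip(M[r], M[col])]
    return [M[r][n] for r in range(n)]


if __name__ == "__main__":
    P = pair_rotation_matrix(4, [(0, 1)])
    y_start = gauss_jordan_solve(P, [1.0, 1.0, 1.0, 1.0])
    print("starting y:", y_start)
```

For a single pair with both x components equal to 1, the solution is $y_i = 1$ and $y_j = 0$, so the transformed run starts from the same x vector as the reference run.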

Experiments 60 to 62 were multiple pairwise rotations of the y coordinates, but each failed to provide a feasible point. The probable cause of failure was the difficulty of satisfying the 12 nonlinear equality constraints. Two or more of these constraints were violated in each experiment. Experiment 63 was another multiple pair rotation, in which 4 of the single pair combinations from previous good experiments were tried. This experiment resulted in a non-optimal solution. The final experiment tried was pairwise rotation of 12 pairs of variables (treated consecutively), but it too failed to determine a feasible point.

To sum up the results of the experiments in this section, it can be stated that scaling is a significant factor, affecting the ability of GRG to find the feasible region, and then to optimize the function successfully. It is hard to give guidelines for scaling a constrained nonlinear program, but a first step would be to convert the coefficients in the objective function and constraints to the same order of magnitude. If the code fails to find a feasible point, look at the values of the constraints to see if any exceed the recommended figure of ±100. Finally, check the reduced gradient in the solution information, to see if there are any large components. At the optimum point, these values should be essentially zero.

The experiments with translation and rotation operations indicate that these operations make it harder for the GRG code to find the optimum point, and thus are not recommended.
APPENDIX A


Test  Problems  Used  with  GRG  and  SUMT  Codes 

A. HIMMELBLAU PROBLEM 16

Source: J.D. Pearson, On Variable Metric Methods of Minimization, Research Analysis Corporation Report RAC-TP-302, McLean, Virginia, May 1968.

Number of variables: 9

Number of constraints: 13 nonlinear inequality constraints
1 upper bound

Objective function:

Maximize: $f(x) = 0.5\,(x_1 x_4 - x_2 x_3 + x_3 x_9 - x_5 x_9 + x_5 x_8 - x_6 x_7)$

Constraints:

$1 - x_3^2 - x_4^2 \ge 0$
$1 - x_9^2 \ge 0$
$1 - x_5^2 - x_6^2 \ge 0$
$1 - x_1^2 - (x_2 - x_9)^2 \ge 0$
$1 - (x_1 - x_5)^2 - (x_2 - x_6)^2 \ge 0$
$1 - (x_1 - x_7)^2 - (x_2 - x_8)^2 \ge 0$
$1 - (x_3 - x_5)^2 - (x_4 - x_6)^2 \ge 0$
$1 - (x_3 - x_7)^2 - (x_4 - x_8)^2 \ge 0$
$1 - x_7^2 - (x_8 - x_9)^2 \ge 0$
$x_1 x_4 - x_2 x_3 \ge 0$
$x_3 x_9 \ge 0$
$-x_5 x_9 \ge 0$
$x_5 x_8 - x_6 x_7 \ge 0$
$x_9 \ge 0$

Starting point: $x_i = 1$, $i = 1, \ldots, 9$

f(x) = 0

Optimum point:

x* = (0.9971, -0.0758, 0.5530, 0.8331, 0.9981, -0.0623, 0.5642, 0.8256, 0.0000024)$^T$

f(x*) = 0.8660
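The statement of problem 16 can be checked numerically against the starting point and the reported optimum. The sketch below uses the standard published form of this test problem; the function and variable names are illustrative assumptions.

```python
def objective_16(x):
    """Objective of Himmelblau problem 16 (to be maximized)."""
    x1, x2, x3, x4, x5, x6, x7, x8, x9 = x
    return 0.5 * (x1 * x4 - x2 * x3 + x3 * x9 - x5 * x9 + x5 * x8 - x6 * x7)


def constraints_16(x):
    """Constraint values; each one must be >= 0 at a feasible point."""
    x1, x2, x3, x4, x5, x6, x7, x8, x9 = x
    return [
        1.0 - x3 ** 2 - x4 ** 2,
        1.0 - x9 ** 2,
        1.0 - x5 ** 2 - x6 ** 2,
        1.0 - x1 ** 2 - (x2 - x9) ** 2,
        1.0 - (x1 - x5) ** 2 - (x2 - x6) ** 2,
        1.0 - (x1 - x7) ** 2 - (x2 - x8) ** 2,
        1.0 - (x3 - x5) ** 2 - (x4 - x6) ** 2,
        1.0 - (x3 - x7) ** 2 - (x4 - x8) ** 2,
        1.0 - x7 ** 2 - (x8 - x9) ** 2,
        x1 * x4 - x2 * x3,
        x3 * x9,
        -x5 * x9,
        x5 * x8 - x6 * x7,
        x9,
    ]


X_START_16 = [1.0] * 9
X_OPT_16 = [0.9971, -0.0758, 0.5530, 0.8331, 0.9981, -0.0623,
            0.5642, 0.8256, 0.0000024]

if __name__ == "__main__":
    print("f(start) =", objective_16(X_START_16))
    print("f(opt)   =", objective_16(X_OPT_16))
    print("smallest constraint value at opt =", min(constraints_16(X_OPT_16)))
```

With the optimum rounded to four decimals, several constraints are active to within rounding error, so small negative constraint values of the order of $10^{-4}$ are expected.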


B. HIMMELBLAU PROBLEM 4

Source: J. Bracken and G.P. McCormick, "Selected Applications of Nonlinear Programming," John Wiley and Sons, Inc., New York, 1968.

Number of variables: 10

Number of constraints: 3 linear equality constraints
10 bounds on independent variables

Objective function:

Minimize: $f(x) = \sum_{i=1}^{10} x_i \left( c_i + \ln \dfrac{x_i}{\sum_{j=1}^{10} x_j} \right)$

Constraints:

$h_1(x) = x_1 + 2x_2 + 2x_3 + x_6 + x_{10} - 2 = 0$
$h_2(x) = x_4 + 2x_5 + x_6 + x_7 - 1 = 0$
$h_3(x) = x_3 + x_7 + x_8 + 2x_9 + x_{10} - 1 = 0$
$x_i \ge 0$, $i = 1, \ldots, 10$

Starting Point: $x_i = 0.1$, $i = 1, \ldots, 10$

f(x) = -20.961

Optimum Point:

x* = (0.0406, 0.1477, 0.7832, 0.0014, 0.4853, 0.0007, 0.0274, 0.0180, 0.0375, 0.0969)

f(x*) = -47.761

Data:

$c_1 = -6.089$   $c_2 = -17.164$   $c_3 = -34.054$   $c_4 = -5.914$
$c_5 = -24.721$   $c_6 = -14.986$   $c_7 = -24.100$   $c_8 = -10.708$
$c_9 = -26.662$   $c_{10} = -22.179$
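The data for problem 4 can be checked against the tabulated function values. The sketch assumes the standard chemical-equilibrium form of the objective, $f(x) = \sum_i x_i (c_i + \ln(x_i / \sum_j x_j))$, which reproduces the starting value quoted above; the names are illustrative.

```python
import math

C_4 = [-6.089, -17.164, -34.054, -5.914, -24.721,
       -14.986, -24.100, -10.708, -26.662, -22.179]


def objective_4(x):
    """Objective of Himmelblau problem 4 (to be minimized); requires x_i > 0."""
    total = sum(x)
    return sum(xi * (ci + math.log(xi / total)) for xi, ci in zip(x, C_4))


def equality_residuals_4(x):
    """Values of the three linear equality constraints h_1, h_2, h_3."""
    x1, x2, x3, x4, x5, x6, x7, x8, x9, x10 = x
    return [
        x1 + 2.0 * x2 + 2.0 * x3 + x6 + x10 - 2.0,
        x4 + 2.0 * x5 + x6 + x7 - 1.0,
        x3 + x7 + x8 + 2.0 * x9 + x10 - 1.0,
    ]


X_START_4 = [0.1] * 10
X_OPT_4 = [0.0406, 0.1477, 0.7832, 0.0014, 0.4853,
           0.0007, 0.0274, 0.0180, 0.0375, 0.0969]

if __name__ == "__main__":
    print("f(start) =", objective_4(X_START_4))
    print("f(opt)   =", objective_4(X_OPT_4))
    print("h(opt)   =", equality_residuals_4(X_OPT_4))
```

Note that the starting point is not feasible for the equality constraints; the residuals at the rounded optimum are small but not exactly zero because the point is quoted to four decimals.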

C. HIMMELBLAU PROBLEM 20

Source: D.A. Paviani, Ph.D. dissertation, The University of Texas, Austin, Texas, 1969.

Number of variables: 24

Number of constraints: 12 nonlinear equality constraints
2 linear equality constraints
6 nonlinear inequality constraints
24 bounds on independent variables

Objective function:

Minimize $f(x) = \sum_{i=1}^{24} a_i x_i$

Constraints:

$$h_i(x) = \frac{x_{i+12}}{b_{i+12} \sum_{j=13}^{24} x_j / b_j} - \frac{c_i x_i}{40\, b_i \sum_{j=1}^{12} x_j / b_j} = 0, \qquad i = 1, \ldots, 12$$

$$h_{13}(x) = \sum_{i=1}^{24} x_i - 1 = 0$$

$$h_{14}(x) = \sum_{i=1}^{12} \frac{x_i}{d_i} + f \sum_{i=13}^{24} \frac{x_i}{b_i} - 1.671 = 0$$

where $f = (0.7302)(530)\left(\dfrac{14.7}{40}\right)$

$$-\frac{x_i + x_{i+12}}{\sum_{j=1}^{24} x_j} + e_i \ge 0, \qquad i = 1, 2, 3$$

$$-\frac{x_{i+3} + x_{i+15}}{\sum_{j=1}^{24} x_j} + e_i \ge 0, \qquad i = 4, 5, 6$$

$$x_i \ge 0, \qquad i = 1, \ldots, 24$$

Starting Point: $x_i = 0.04$, $i = 1, \ldots, 24$

f(x) = 0.14696

Optimum Point:

x* = (0.0, 0.1072, 0.1114, 0.0, 0.0, 0.0, 0.0755, 0.0,
0.0, 0.0, 0.0, 0.0112, 0.0, 0.1928, 0.2886, 0.0,
0.0, 0.0, 0.2129, 0.0, 0.0, 0.0, 0.0, 0.0004)

f(x*) = 0.055658041

Data:

 i    a_i      b_i       c_i     d_i      e_i
 1    0.0693   44.094    123.7   31.244   0.1
 2    0.0577   58.12     31.7    36.12    0.3
 3    0.05     58.12     45.7    34.784   0.4
 4    0.20     137.4     14.7    92.7     0.3
 5    0.26     120.9     84.7    82.7     0.6
 6    0.55     170.9     27.7    91.6     0.3
 7    0.06     62.501    49.7    56.708
 8    0.10     84.94     7.1     82.7
 9    0.12     133.425   2.1     80.8
10    0.18     82.507    17.7    64.517
11    0.10     46.07     0.85    49.4
12    0.09     60.097    0.64    49.1
13    0.0693   44.094
14    0.0577   58.12
15    0.05     58.12
16    0.20     137.4
17    0.26     120.9
18    0.55     170.9
19    0.06     62.501
20    0.10     84.94
21    0.12     133.425
22    0.18     82.507
23    0.10     46.07
24    0.09     60.097
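The objective coefficients and the six inequality constraints of problem 20 can be checked against the tabulated function values. The sketch covers only the linear objective, the sum constraint, and the inequality constraints in the form $-(x_i + x_{i+12})/\sum_j x_j + e_i \ge 0$; the names are illustrative assumptions.

```python
A_FIRST_12 = [0.0693, 0.0577, 0.05, 0.20, 0.26, 0.55,
              0.06, 0.10, 0.12, 0.18, 0.10, 0.09]
A_20 = A_FIRST_12 + A_FIRST_12  # a_13..a_24 repeat a_1..a_12 in the data table
E_20 = [0.1, 0.3, 0.4, 0.3, 0.6, 0.3]


def objective_20(x):
    """Linear objective of Himmelblau problem 20 (to be minimized)."""
    return sum(a * xi for a, xi in zip(A_20, x))


def inequality_values_20(x):
    """Values of the six inequality constraints; each must be >= 0."""
    total = sum(x)
    # i = 1, 2, 3 (1-based): pairs (x_i, x_{i+12})
    first = [-(x[i] + x[i + 12]) / total + E_20[i] for i in range(3)]
    # i = 4, 5, 6 (1-based): pairs (x_{i+3}, x_{i+15})
    second = [-(x[i + 3] + x[i + 15]) / total + E_20[i] for i in range(3, 6)]
    return first + second


X_START_20 = [0.04] * 24
X_OPT_20 = [0.0, 0.1072, 0.1114, 0.0, 0.0, 0.0, 0.0755, 0.0,
            0.0, 0.0, 0.0, 0.0112, 0.0, 0.1928, 0.2886, 0.0,
            0.0, 0.0, 0.2129, 0.0, 0.0, 0.0, 0.0, 0.0004]

if __name__ == "__main__":
    print("f(start) =", objective_20(X_START_20))
    print("f(opt)   =", objective_20(X_OPT_20))
    print("sum of x at opt =", sum(X_OPT_20))
    print("inequalities at opt =", inequality_values_20(X_OPT_20))
```

At the reported optimum the second and third inequality constraints are active to within rounding, which is consistent with the values of $e_2$ and $e_3$ in the data table.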
APPENDIX B

Experiments Conducted with SUMT and GRG Codes

A. VARIABLE TRANSFORMATIONS USING GRG-HIMMELBLAU PROBLEM 16

Experiment Number 1
f(x*)= .43305879  iterations=8  CPU=1.71 sec.
x*= (.8660, .4999, .8660, .4999, 0.0, 0.0, .8660, 1.5000, 1.0)

Experiment Number 2
TOL parameter changed in DEGEN subroutine to $10^{-4}$
f(x*)= .86603619  iterations=17  CPU=2.90 sec.
x*= (0.0, 0.0, .8660, -.5, 0.0, -1.0, .8660, .5, 1.0)

Experiment Number 3
Lower bounds placed on all variables
f(x*)= .50000076  iterations=10  CPU=2.18 sec.
x*= (.8660, .5000, .5006, .8657, 0.0, 0.0, .8660, 1.5, 1.0)

Experiment Number 4
Lower bounds on all variables
Transformation: $x_i = y_i^2$
f(x*)= .50001561  iterations=20  CPU=3.56 sec.
x*= (1.0, .9948, 0.0, 1.0, 0.0, 1.0, 0.0, .9948, 1.0)

Experiment Number 5
Run experiment number 3 from optimum generated by SUMT
f(x*)= .86602621  iterations=11  CPU=1.98 sec.
x*= (.9306, .7071, .0254, 1.0, .9308, .7067, 0.0, 1.0, 0.0)

Experiment Number 6
Run experiment number 4 from optimum generated by SUMT
f(x*)= .86601997  iterations=5  CPU=1.38 sec.
x*= (.8660, .5000, 0.0, 1.0, .8660, .5000, 0.0, 1.0, 0.0)

Experiment Number 7
Transformation: $x_i = |y_i|$, $y_i \ge 0$
f(x*)= .50000076  iterations=10  CPU=2.33 sec.
x*= (.8660, .5000, .5000, .8657, 0.0, 0.0, .8660, 1.5, 1.0)

Experiment Number 8
Transformation: $x_i = |y_i|$; no lower bounds
f(x*)= .43301447  iterations=12  CPU=2.47 sec.
x*= (.8660, .5000, .8660, .5000, 0.0, 0.0, .8660, 1.5, 1.0)

Experiment Number 9
Transformation: $x_i = e^{y_i}$; no lower bounds
f(x*)= .4833425  iterations=22  CPU=5.77 sec.
x*= (.9845, .2034, .08, .9968, .0215, .0613, .3403, .9682, .0279)

Experiment Number 10
Transformation: $x_i = e^{y_i}$, $y_i \ge 0$
No feasible point found

Experiment Number 11
Transformation: $x_i = \sin^2 y_i$; no lower bounds
No feasible point found

Experiment Number 12
Transformation: $x_i = \sin^2 y_i$, $y_i \ge 0$
No feasible point found

Experiment Number 13
Transformation: $x_i = \sin^2 y_i$; no lower bounds
$y^0$ = (.7854, .7854, ..., .7854)
f(x*)= .50035409  iterations=12  CPU=2.59 sec.
x*= (.9898, .1548, .1485, .9889, .0017, .0008, .2533, .8311, .2971)

Experiment Number 14
Transformation: $x_i = |y_i|$; no lower bounds
Start from reported optimum (see Appendix A)
No feasible point found

Experiment Number 15
Transformation: $x_i = \dfrac{e^{y_i}}{e^{y_i} + e^{-y_i}}$; no lower bounds
No feasible point found

B. VARIABLE TRANSFORMATIONS USING GRG-HIMMELBLAU PROBLEM 20

Experiment Number 16
f(x*)= .055658041  iterations=30  CPU=13.56 sec.
x*= (0.0, .1072, .1114, 0.0, 0.0, 0.0, .0755, 0.0, 0.0, 0.0,
0.0, .0112, 0.0, .1928, .2886, 0.0, 0.0, 0.0, .2129, 0.0,
0.0, 0.0, 0.0, .0004)

Experiment Number 17
Lower bounds changed to inequality constraints
f(x*)= .055658019  iterations=28  CPU=23.03 sec.

Experiment Number 18
Transformation: $x_i = y_i^2$; $y_i \ge 0$
f(x*)= .055668194  iterations=47  CPU=28.25 sec.

Experiment Number 19
Transformation: $x_i = |y_i|$; $y_i \ge 0$
f(x*)= .05565887  iterations=70  CPU=40.11 sec.

Experiment Number 20
Transformation: $x_i = |y_i|$; no lower bounds
No feasible point found

Experiment Number 21
Transformation: $x_i = \sin^2 y_i$
f(x*)= .055672184  iterations=51  CPU=37.40 sec.

Experiment Number 22
Transformation: $x_i = e^{y_i}$, $y_i \ge 0$
$y^0$ = (-3.281, ..., -3.281)
Did not run because $y^0$ violated lower bounds

Experiment Number 23
Transformation: $x_i = e^{y_i}$, no lower bounds
No feasible point found

Experiment Number 24
Transformation: $x_i = \dfrac{e^{y_i}}{e^{y_i} + e^{-y_i}}$; no lower bounds
No feasible point found


C. VARIABLE TRANSFORMATIONS USING GRG CODE-HIMMELBLAU PROBLEM 4

Experiment Number 25
f(x*)= -47.606887  iterations=9  CPU=1.68 sec.
x*= (.1278, .1678, .6454, .0033, .4838, .0018, .0273, .0303, .0265, .2439)

Experiment Number 26
Transformation: $x_i = e^{y_i}$
f(x*)= -47.751577  iterations=12  CPU=2.62 sec.
x*= (.0270, .1465, .7820, .0038, .4854, .0035, .0219, .0159, .0339, .1124)

Experiment Number 27
Transformation: $x_i =$ , $y_i \ge 0$
f(x*)= -47.606887  iterations=9  CPU=1.59 sec.
x*= (.1278, .1678, .6454, .0033, .4838, .0018, .0273, .0303, .0265, .2439)


D. VARIABLE TRANSFORMATIONS USING SUMT CODE

Experiment Number 28  HIMMELBLAU PROBLEM 16
f(x*)= .8660277  67 points  CPU=11.69 sec.
x*= (-.4924, -.3562, .5076, -.8616, -.4924, -.8703, .5076, -.3475, .5141)

Experiment Number 29  HIMMELBLAU PROBLEM 16
Place lower bounds on all variables
f(x*)= .8660196  59 points  CPU=15.29 sec.
x*= (.9753, .2208, .2965, .9550, .9753, .2208, .2964, .9550, 0.0)

Experiment Number 30  HIMMELBLAU PROBLEM 16
Transformation: $x_i = y_i^2$
f(x*)= .8660178  70 points  CPU=15.78 sec.
x*= (.8660, .4999, .0016, .9998, .8660, .4985, .0003, 1.0, 0.0)

Experiment Number 31  HIMMELBLAU PROBLEM 20
f(x*)= .06520525
Program was terminated after four minutes of CPU time without having reached optimum point.

Experiment Number 32  HIMMELBLAU PROBLEM 4
Transformation: $x_i = e^{y_i}$
f(x*)= -47.76488  57 points  CPU=13.22 sec.


E. PRODUCT TRANSFORMATIONS USING GRG CODE-HIMMELBLAU PROBLEM 16

Experiment Number 33
Transformation: $y_i^2 - y_j^2 = x_i x_j$
f(x*)= .43302779  iterations=12  CPU=33.51 sec.
x*= (.8660, .5000, .8660, .5000, 0.0, 0.0, .8660, 1.5000, 1.0, ...)

Experiment Number 34
Transformation: $y_i^2 - y_j^2 = x_i x_j$
TOL parameter changed in DEGEN subroutine to $10^{-4}$
f(x*)= .43302779  iterations=12  CPU=33.7 sec.
x*= (.8660, .5000, .8660, .5000, 0.0, 0.0, .8660, 1.5000, 1.0, ...)


F. SCALING EXPERIMENTS USING GRG-HIMMELBLAU PROBLEM 16

Experiment Number 35
Scaling Factors:
$x_1 \to x_3$: 100   $x_4 \to x_6$: 10   $x_7 \to x_9$: 1
f(x*)= .86603674  iterations=24  CPU=4.5 sec.

Experiment Number 36
Scaling Factors:
$x_1 \to x_5$: 10000   $x_6 \to x_9$: 1
No feasible point found

Experiment Number 37
Scaling Factors:
$x_1 \to x_3$: 1000   $x_4 \to x_6$: 100   $x_7 \to x_9$: 1
No feasible point found

Experiment Number 38
Scaling Factors:
$x_1 \to x_3$: 1   $x_4 \to x_6$: 10   $x_7 \to x_9$: 100
f(x*)= .86604509  iterations=19  CPU=3.24 sec.

Experiment Number 39
Scaling Factors:
$x_1 \to x_3$: 10   $x_4 \to x_6$: 10   $x_7 \to x_9$: 10
f(x*)= .86600928  iterations=17  CPU=3.47 sec.


G. TRANSLATION EXPERIMENT USING GRG-HIMMELBLAU PROBLEM 16

Experiment Number 40
Translation: $x_i = y_i + 0.5$
$y^0$ = (0.5, 0.5, ..., 0.5)
f(x*)= .86602981  iterations=19  CPU=3.88 sec.


H. ROTATION EXPERIMENTS USING GRG-HIMMELBLAU PROBLEM 16

Experiment Number 41
$x_1 = y_1 - y_2$
$x_2 = y_1 + y_2$
f(x*)= .86601039  iterations=18  CPU=3.37 sec.

Experiment Number 42
$x_3 = y_3 - y_4$
$x_4 = y_3 + y_4$
f(x*)= .86605021  iterations=18  CPU=3.17 sec.

Experiment Number 43
$x_5 = y_5 - y_6$
$x_6 = y_5 + y_6$
f(x*)= .86603622  iterations=19  CPU=3.39 sec.

Experiment Number 44
f(x*)= .50000327  iterations=16  CPU=3.24 sec.

Experiment Number 45
$x_8 = y_8 - y_9$
$x_9 = y_8 + y_9$
f(x*)= .5001803  iterations=21  CPU=3.49 sec.

Experiment Number 46
$x_1 = y_1 - y_9$
$x_9 = y_1 + y_9$
f(x*)= .86604409  iterations=18  CPU=3.20 sec.

Experiment Number 47
$x_1 = y_1 - y_2$   $x_3 = y_3 - y_4$   $x_5 = y_5 - y_6$   $x_7 = y_7 - y_9$
$x_2 = y_1 + y_2$   $x_4 = y_3 + y_4$   $x_6 = y_5 + y_6$   $x_9 = y_7 + y_9$
f(x*)= .86600559  iterations=18  CPU=3.13 sec.

Experiment Number 48
x = P y

1 

-1 

-1 

0  0 

0 

1 

0 

-1 

1 

1 

0 

-1  0 

0 

0 

0 

0 

1 

1 

1 

0.  -1 

0 

0 

0 

0 

0 

1 

0 

1  0 

0 

0 

0 

0 

P  = 

0 

0 

1 

0  1 

0 

0 

0 

0 

0 

0 

-1 

0  0 

1 

0 

0 

0 

0 

0 

0 

-1  0 

0 

1 

0 

0 

I 

0 

0 

0 

0  0 

0 

0 

1 

0 

0 

-1 

1 

0  -1 

0 

0 

0 

f(x*)= .86602427  iterations=20  CPU=3.25 sec.
I. SCALING EXPERIMENTS USING GRG-HIMMELBLAU PROBLEM 20
Experiment  Number  49 
Scaling  Factors: 


No  feasible  point  found 


Experiment  Number  50 
Scaling  Factors: 

^x, ^  :  10  X 


X. 


12 


13“^^24  *  ^ 


No  feasible  point  found 


Experiment  Number  51 

Scaling  Factors: 

X, — ^  x_  :  100  x^ 
1  8  9 

No  feasible  point  found 


X 


16 


:  10  X 


17 


24 


Experiment Number 52
Scaling Factors:
$x_1 \to x_8$: 2   $x_9 \to x_{16}$: 2   $x_{17} \to x_{24}$: 2
No feasible point found

Experiment Number 53
Scaling Factors:
$x_1 \to x_8$: 0.1   $x_9 \to x_{16}$: 0.1   $x_{17} \to x_{24}$: 0.1
f(x*)= .055658041  iterations=29  CPU=14.76 sec.
J. TRANSLATION EXPERIMENT USING GRG-HIMMELBLAU PROBLEM 20

Experiment Number 54
Translation: $x_i = y_i + .04$
$y^0$ = (0.0, ..., 0.0)
f(x*)= .055658041  iterations=30  CPU=15.01 sec.


K.  ROTATION  EXPERIMENTS  USING  GRG-HIMMELBLAU  PROBLEM  20 

(All  lower  bounds  converted  to  inequality  constraints) 

Experiment Number 55

$x_1 = y_1 - y_2$    $x_2 = y_1 + y_2$

f(x*)=.055658019  iterations=26  CPU=19.61 sec.


Experiment Number 56

$x_3 = y_3 - y_4$    $x_4 = y_3 + y_4$

f(x*)=.055658019  iterations=27  CPU=19.89 sec.


Experiment Number 57

$x_5 = y_5 - y_6$    $x_6 = y_5 + y_6$

f(x*)=.076972898  iterations=24  CPU=15.93 sec.


Experiment Number 58

$x_7 = y_7 - y_8$    $x_8 = y_7 + y_8$

f(x*)=.055658019  iterations=29  CPU=20.45 sec.


Experiment Number 59

$x_9 = y_9 - y_{10}$    $x_{10} = y_9 + y_{10}$

f(x*)=.055658019  iterations=27  CPU=19.05 sec.


Experiment Number 60

$x_1 = y_1 - y_2$    $x_3 = y_3 - y_4$    $x_5 = y_5 - y_6$    $x_7 = y_7 - y_8$
$x_2 = y_1 + y_2$    $x_4 = y_3 + y_4$    $x_6 = y_5 + y_6$    $x_8 = y_7 + y_8$

No feasible point; constraints violated: 4, 9


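Experiment Number 61

$x_9 = y_9 - y_{10}$    $x_{11} = y_{11} - y_{12}$    $x_{13} = y_{13} - y_{14}$    $x_{15} = y_{15} - y_{16}$
$x_{10} = y_9 + y_{10}$    $x_{12} = y_{11} + y_{12}$    $x_{14} = y_{13} + y_{14}$    $x_{16} = y_{15} + y_{16}$

No feasible point; constraints violated: 8, 9, 11, 12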


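Experiment Number 62

$x_{17} = y_{17} - y_{18}$    $x_{19} = y_{19} - y_{20}$    $x_{21} = y_{21} - y_{22}$    $x_{23} = y_{23} - y_{24}$
$x_{18} = y_{17} + y_{18}$    $x_{20} = y_{19} + y_{20}$    $x_{22} = y_{21} + y_{22}$    $x_{24} = y_{23} + y_{24}$

No feasible point; constraints violated: 4, 9, 11, 12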


Experiment Number 63

$x_1 = y_1 - y_2$    $x_3 = y_3 - y_4$    $x_7 = y_7 - y_8$    $x_9 = y_9 - y_{10}$
$x_2 = y_1 + y_2$    $x_4 = y_3 + y_4$    $x_8 = y_7 + y_8$    $x_{10} = y_9 + y_{10}$

f(x*)=.11085991  iterations=25  CPU=17.37 sec.


Experiment  Number  64 

Pairwise rotation of all 24 variables
No feasible point found; constraint violated: 4
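
The pairwise substitutions used in these experiments, of the form x_i = y_i - y_j and x_j = y_i + y_j, can be collected into a matrix P with x = P y. The sketch below builds such a matrix in plain Python. The function names are illustrative, indices are zero-based, and the transformation is left unnormalized as in the listings (a true 45-degree rotation would divide each entry of the pair by the square root of 2).

```python
def pairwise_rotation_matrix(n, pairs):
    """Build P (n x n, list of lists) so that x = P y applies
    x_i = y_i - y_j and x_j = y_i + y_j for each (i, j) in pairs.
    Variables not named in any pair are left unchanged."""
    P = [[1.0 if r == c else 0.0 for c in range(n)] for r in range(n)]
    for i, j in pairs:
        P[i][i], P[i][j] = 1.0, -1.0
        P[j][i], P[j][j] = 1.0, 1.0
    return P


def apply_matrix(P, y):
    """Return x = P y."""
    return [sum(p * v for p, v in zip(row, y)) for row in P]


# Transform the first two of four variables (zero-based indices 0 and 1).
P = pairwise_rotation_matrix(4, [(0, 1)])
x = apply_matrix(P, [3.0, 1.0, 5.0, 7.0])   # x = [2.0, 4.0, 5.0, 7.0]
```

Under such a substitution a simple bound on x_i becomes a linear inequality in two of the y variables, which is consistent with the note at the head of this section that all lower bounds were converted to inequality constraints.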


APPENDIX C


Generalized Reduced Gradient Method

The generalized reduced gradient (GRG) method is an algorithm
that solves nonlinear programming problems of the form:

$$
\begin{aligned}
\text{minimize} \quad & f(x) \\
\text{subject to} \quad & h(x) = 0, \quad a \le x \le b
\end{aligned}
\tag{40}
$$

where h(x) is of dimension m.

The  reduced  gradient  method  was  originally  proposed  by 
Wolfe  for  problems  with  linear  constraints  and  was 

generalized  to  handle  nonlinear  constraints  by  Abadie  and 
Carpentier  /37/.  The  material  in  this  appendix  is  based  on 
Himmelblau  /IW . 

Inequality constraints are adjusted to the formulation
above by the introduction of non-negative slack variables.
The slack variables are added to the set of n variables.
If a nondegeneracy assumption holds, the GRG algorithm
partitions the variables into two distinct sets. One set
consists of m basic, dependent variables, $x_I$. The other set
comprises (n-m) nonbasic, independent variables, $x_K$.

At each iteration, the reduced gradient method considers
the problem only in terms of the independent variables.
Since the dependent variables are determined implicitly by
the independent variables, the objective function is a
function of the (n-m) independent variables.
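
The basic idea can be illustrated by the following example:

$$
\text{minimize} \quad f(x_1, x_2) \qquad \text{subject to} \quad h(x_1, x_2) = 0
$$

For differential displacements in $x_1$ and $x_2$,

$$
df(x) = \frac{\partial f(x)}{\partial x_1}\,dx_1 + \frac{\partial f(x)}{\partial x_2}\,dx_2 \tag{41}
$$

$$
dh(x) = \frac{\partial h(x)}{\partial x_1}\,dx_1 + \frac{\partial h(x)}{\partial x_2}\,dx_2 = 0 \tag{42}
$$

Solve $dh(x) = 0$ for $dx_2$,

$$
dx_2 = -\left[\frac{\partial h(x)}{\partial x_2}\right]^{-1} \frac{\partial h(x)}{\partial x_1}\,dx_1 \tag{43}
$$

and introduce $dx_2$ into the differential objective function

$$
df(x) = \left\{ \frac{\partial f(x)}{\partial x_1} - \frac{\partial f(x)}{\partial x_2} \left[\frac{\partial h(x)}{\partial x_2}\right]^{-1} \frac{\partial h(x)}{\partial x_1} \right\} dx_1 \tag{44}
$$

to yield the reduced gradient:

$$
\frac{df(x)}{dx_1} = \frac{\partial f(x)}{\partial x_1} - \frac{\partial f(x)}{\partial x_2} \left[\frac{\partial h(x)}{\partial x_2}\right]^{-1} \frac{\partial h(x)}{\partial x_1} \tag{45}
$$

One necessary condition for f(x) to be a minimum is:

$$
\frac{df(x)}{dx_1} = 0 \tag{46}
$$

Thus the generalized reduced gradient can be expressed as:

$$
\frac{df(x)}{dx_K} = \nabla_{x_K}^T f - \nabla_{x_I}^T f \left[\frac{\partial h}{\partial x_I}\right]^{-1} \left[\frac{\partial h}{\partial x_K}\right] \tag{47}
$$

where the subscript I refers to the basic (dependent) variables
and the subscript K to the nonbasic (independent) variables.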
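
As a numerical check of equation (45), the sketch below evaluates the reduced gradient for a small two-variable problem and compares it with a finite-difference slope taken along the constraint. The objective f = x1^2 + x2^2 and the constraint h = x1 + 2*x2 - 4 = 0 are illustrative assumptions and do not come from the thesis.

```python
def f(x1, x2):
    """Illustrative objective."""
    return x1 ** 2 + x2 ** 2


def h(x1, x2):
    """Illustrative equality constraint, h(x1, x2) = 0."""
    return x1 + 2.0 * x2 - 4.0


def reduced_gradient(x1, x2):
    """Reduced gradient with x1 independent and x2 dependent, as in (45)."""
    df_dx1, df_dx2 = 2.0 * x1, 2.0 * x2
    dh_dx1, dh_dx2 = 1.0, 2.0
    return df_dx1 - df_dx2 * (1.0 / dh_dx2) * dh_dx1


def x2_on_constraint(x1):
    """Dependent variable x2 solved from h(x1, x2) = 0."""
    return (4.0 - x1) / 2.0


x1 = 2.0
x2 = x2_on_constraint(x1)
rg = reduced_gradient(x1, x2)

# Central finite difference of f along the constraint curve.
eps = 1e-6
fd = (f(x1 + eps, x2_on_constraint(x1 + eps))
      - f(x1 - eps, x2_on_constraint(x1 - eps))) / (2.0 * eps)
```

In this toy problem the reduced gradient vanishes at x1 = 0.8, x2 = 1.6, which is the constrained minimum, in line with condition (46).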
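
As an example of this method consider the following problem:

$$
\begin{aligned}
\text{minimize} \quad & x_1^2 + x_2^2 + x_3^2 + x_4^2 - 2x_1 - 3x_4 \\
\text{subject to} \quad & 2x_1 + x_2 + x_3 + 4x_4 = 7 \\
& x_1 + x_2 + 2x_3 + x_4 = 6 \\
& x_i \ge 0, \quad i = 1, 2, 3, 4
\end{aligned}
$$

Select $x_1$ and $x_2$ as the basic variables. Then

$$
\nabla_{x_I}^T f = \begin{bmatrix} 2x_1 - 2 & 2x_2 \end{bmatrix},
\qquad
\nabla_{x_K}^T f = \begin{bmatrix} 2x_3 & 2x_4 - 3 \end{bmatrix},
$$

$$
\left[\frac{\partial h}{\partial x_I}\right]^{-1}
= \begin{bmatrix} 2 & 1 \\ 1 & 1 \end{bmatrix}^{-1}
= \begin{bmatrix} 1 & -1 \\ -1 & 2 \end{bmatrix},
\qquad
\frac{\partial h}{\partial x_K} = \begin{bmatrix} 1 & 4 \\ 2 & 1 \end{bmatrix}
\tag{48}
$$

The reduced gradient becomes

$$
\frac{df(x)}{dx_K}
= \begin{bmatrix} 2x_3 & 2x_4 - 3 \end{bmatrix}
- \begin{bmatrix} 2x_1 - 2 & 2x_2 \end{bmatrix}
\begin{bmatrix} 1 & -1 \\ -1 & 2 \end{bmatrix}
\begin{bmatrix} 1 & 4 \\ 2 & 1 \end{bmatrix}
\tag{49}
$$

Simplifying and setting the reduced gradient equal to
zero yields:

$$
\begin{aligned}
2x_1 - 6x_2 + 2x_3 - 2 &= 0 \\
-6x_1 + 4x_2 + 2x_4 + 3 &= 0
\end{aligned}
\tag{50}
$$

Given the feasible point x=(2,2,1,0) the reduced
gradient equals (-8,-1). This indicates that at this point,
$x_3$ and $x_4$ are increased together in the ratio of eight to
one. As they increase, $x_1$ and $x_2$ change in such a way as
to keep the constraints satisfied.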
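
The numbers in this example can be reproduced with a few lines of plain Python. The sketch assumes that x1 and x2 are the basic variables and that the constraints are 2*x1 + x2 + x3 + 4*x4 = 7 and x1 + x2 + 2*x3 + x4 = 6, which is what the matrices in (48) and the feasible point (2,2,1,0) imply. The helper names are illustrative.

```python
def row_times_matrix(v, M):
    """Row vector (length 2) times a 2x2 matrix."""
    return [v[0] * M[0][c] + v[1] * M[1][c] for c in range(2)]


def inv2(M):
    """Inverse of a 2x2 matrix given as nested lists."""
    (a, b), (c, d) = M
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]


def reduced_gradient(x):
    """Generalized reduced gradient (47) for the four-variable example."""
    x1, x2, x3, x4 = x
    grad_I = [2 * x1 - 2, 2 * x2]        # basic (dependent) variables x1, x2
    grad_K = [2 * x3, 2 * x4 - 3]        # nonbasic (independent) variables x3, x4
    dh_dxI = [[2.0, 1.0], [1.0, 1.0]]    # constraint Jacobian, basic columns
    dh_dxK = [[1.0, 4.0], [2.0, 1.0]]    # constraint Jacobian, nonbasic columns
    t = row_times_matrix(row_times_matrix(grad_I, inv2(dh_dxI)), dh_dxK)
    return [grad_K[0] - t[0], grad_K[1] - t[1]]


z = reduced_gradient([2.0, 2.0, 1.0, 0.0])   # [-8.0, -1.0]
```

Both components are negative at this point, so the objective decreases locally as x3 and x4 increase, matching the reading given in the text.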

The algorithm continues as long as the reduced gradient
is not zero. For each new iteration the x vector is changed
by the relation:

$$
x^{(k+1)} = x^{(k)} + \alpha^{(k)} d^{(k)} \tag{51}
$$

where $\alpha^{(k)} \ge 0$ is the step length, and $d^{(k)}$ is the
direction vector for the next iteration. The elements of the
direction vector are determined differently for the independent
and the dependent variables.

The  search  directions  for  the  independent  variables  are 
determined  as  follows: 

f 


d. 

D 


(k) 


(k)  ^  (k)  ,, 

0  ,  if  X .  =  b .  and  z .  >0 

=K  0  »  if  x^  =  a^  and  <  0  (52) 

,  if  a'  <  X  (k?  ,  ^  ' 

i,  3  j  j  ^ 

where $z_j$ is the jth element of the reduced gradient; $a_j$ is
the lower bound on $x_j$; and $b_j$ is the upper bound on $x_j$.

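The search directions for the dependent variables are
determined by linearizing the constraints:

$$
d_I^{(k)} = -\left[\frac{\partial h}{\partial x_I}\right]^{-1} \left[\frac{\partial h}{\partial x_K}\right] d_K^{(k)} \tag{53}
$$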
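
Continuing with the four-variable example as an illustration, the sketch below applies equation (53) to a chosen direction for the independent variables and then takes one step of the form (51). Moving x3 and x4 along (8, 1), the negative of the reduced gradient (-8, -1), and the step length 0.01 are assumptions made for this illustration, as are the function and variable names.

```python
def dependent_direction(inv_dh_dxI, dh_dxK, d_K):
    """Direction for the dependent variables from (53), 2x2 blocks:
    d_I = -[dh/dx_I]^-1 [dh/dx_K] d_K."""
    t = [sum(dh_dxK[r][c] * d_K[c] for c in range(2)) for r in range(2)]
    return [-sum(inv_dh_dxI[r][c] * t[c] for c in range(2)) for r in range(2)]


inv_dh_dxI = [[1.0, -1.0], [-1.0, 2.0]]   # inverse of [[2, 1], [1, 1]]
dh_dxK = [[1.0, 4.0], [2.0, 1.0]]

d_K = [8.0, 1.0]                                     # direction for x3, x4
d_I = dependent_direction(inv_dh_dxI, dh_dxK, d_K)   # direction for x1, x2

step = 0.01                                          # assumed step length
x = [2.0, 2.0, 1.0, 0.0]
d = d_I + d_K                                        # ordered as x1, x2, x3, x4
x_new = [xi + step * di for xi, di in zip(x, d)]

# The constraints are linear here, so they stay satisfied after the step.
h1 = 2 * x_new[0] + x_new[1] + x_new[2] + 4 * x_new[3] - 7
h2 = x_new[0] + x_new[1] + 2 * x_new[2] + x_new[3] - 6
```

With nonlinear constraints the linearization is only approximate, so the stepped point can violate h(x) = 0 and a correction of the dependent variables is needed.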


The step size parameter is determined by a unidimensional
dichotomous search. If the step size and step direction
combination cause some elements of $x_I^{(k+1)}$ to be infeasible
(denoted by $\tilde{x}_I^{(k+1)}$), then the dependent variables are
adjusted to obtain a feasible $x_I$. If at the point
$(x_K^{(k+1)}, \tilde{x}_I^{(k+1)})$ the constraints are not satisfied,
a first order Taylor series approximation is made, and the
resulting expression is solved for $x_I^{(k+1)}$ as follows:

$$
h\left(x_K^{(k+1)}, \tilde{x}_I^{(k+1)}\right)
+ \frac{\partial h\left(x_K^{(k+1)}, \tilde{x}_I^{(k+1)}\right)}{\partial x_I}
\left(x_I^{(k+1)} - \tilde{x}_I^{(k+1)}\right) = 0
$$

$$
x_I^{(k+1)} = \tilde{x}_I^{(k+1)}
- \left[\frac{\partial h\left(x_K^{(k+1)}, \tilde{x}_I^{(k+1)}\right)}{\partial x_I}\right]^{-1}
h\left(x_K^{(k+1)}, \tilde{x}_I^{(k+1)}\right)
\tag{54}
$$


This last expression is termed an iteration by Newton's
method and continues until one of the following outcomes
occurs:

(1) If the last point obtained is feasible and there has
been an improvement in the objective function, the
Newton method is terminated and the search is
continued starting with equation (51).

(2) If the last point obtained is feasible, but the
objective function has worsened, the step size is
reduced by some fraction and the Newton method is
repeated.

(3) If the iterations by Newton's method do not converge
in a fixed number of iterations, the step size is
reduced by some fraction and the Newton method is
repeated.

(4) If the last point obtained is infeasible, a change in
basis is carried out.

As can be seen from equation (51), if the step size
is reduced to zero during any search iteration or during
execution of the Newton method, the algorithm will terminate.
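
The Newton correction of equation (54) can be sketched for a single nonlinear constraint. The constraint h = x_I^2 + x_K^2 - 1, the starting values, the tolerance, and the iteration limit below are all illustrative assumptions. The independent variable x_K is held fixed while the dependent variable x_I is corrected.

```python
def newton_restore(h, dh_dxI, x_I, x_K, tol=1e-10, max_iter=20):
    """Repeat x_I <- x_I - h(x_I, x_K) / (dh/dx_I) until |h| <= tol.
    Returns (x_I, converged)."""
    for _ in range(max_iter):
        residual = h(x_I, x_K)
        if abs(residual) <= tol:
            return x_I, True
        x_I = x_I - residual / dh_dxI(x_I, x_K)
    return x_I, abs(h(x_I, x_K)) <= tol


def h(x_I, x_K):
    """Illustrative nonlinear equality constraint."""
    return x_I ** 2 + x_K ** 2 - 1.0


def dh_dxI(x_I, x_K):
    """Partial derivative of h with respect to the dependent variable."""
    return 2.0 * x_I


# Tentative point after a step: x_K = 0.6 is kept, x_I = 0.9 violates h = 0.
x_I, converged = newton_restore(h, dh_dxI, x_I=0.9, x_K=0.6)
```

When `converged` comes back false, outcome (3) above applies: the step size would be reduced and the correction repeated from the shorter step.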